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EXCITATION VECTOR GENERATOR, SPEECH CODER AND SPEECH 
DECODER 

This is a contipifation of U.S. Patent Application No. 09/440,092, filed November 15, 
1999, pending, whkfo is a division of Application No. 09/101,186, filed June 6, 1998, pending, 
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Technical Field 

The present invention relates to an excitation 
vector generator capable of obtaining a high-quality 
synthesized speech, and a, speech coder and a speech 
decoder which can code and decode a high-quality 
speech signal at a low bit rate. 

Background Art 

A CELP (Code Excited Linear Prediction) type 
speech coder executes linear prediction for each of 
frames obtained by segmenting a speech at a given 
time, and codes predictive residuals (excitation 
signals) resulting from the f rame.-by-f rame lineal 
prediction, using an adaptive codebook having old 
* . excitation vectors stored therein and a random 

codebook which has a plurality of random code vectors 
stored therein. For instance, "Code-Excited Linear 
Prediction(CELP) : High-Quality Speech at Very Low Bit 
Rate," M . R. Schroeder, Proc. ICASSP '85, pp. 937-940 
discloses a CELP type speech coder. 

FIG. 1 illustrates the schematic structure of a 
CELP type speech coder. The CELP type speech coder 
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separates vocal information into excitation 
information and vocal tract information and codes 
them. With regard to the vocal tract information, an 
input speech signal 10 is input to a filter 
coefficients analysis section 11 for linear 
prediction and linear predictive coefficients (LPCs) 
are coded by a filter coefficients quantization 
section 12. Supplying the linear predictive 
coefficients to a synthesis filter 13 allows vocal 
tract information to be added to excitation 
information in the synthesis filter 13. With regard 
to the excitation information, excitation vector 
search in an adaptive codebook 14 and a random, 
codebook 15 is carried out for each segment obtained 
15 by further segmenting a frame (called subframe). The 
search in the adaptive codebook 14 and the search , in 
the random codebook 15 are processes of determining 
the code number and gain .(pitch gain) of an adaptive 
code vector, which minimizes coding distortion in an 
equation 1, and the code number and gain (random code 
gain) of a random code vector. 
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\\v-(gaHp + gcHc)\\ 2 (1) 
v: speech signal (vector) 
25 H: impulse response convolution matrix of the 
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synthesis filter. 

where h: impulse response (vector) of the synthesis 

filter 

L : frame length 



Because a closed loop search of the 'code that 
minimizes the equation 1 involves a vast amount of 
computation for the code search, however, an ordinary 
CELP type speech coder first performs adaptive 
codebook search to specify the code number of an 
adaptive code vector, and then executes random 
codebook search based on the searching result to 
specify the code number of a random code vector. 

The speech coder search by the CELP type speech 
coder will now be explained with reference to FIGS. 
2A through 2C. In the figures, a code x is a target 
vector for the random codebook search obtained by an 
equation 2.. It is assumed that the adaptive codebook 



p: adaptive code vector 



c: random code vector 



ga: adaptive code gain (pitch gain) 



gc : random code gain 



search has already been accomplished. 

x = v - gaHp (2) 

where x: target (vector) for the random codebook 
search 

v: speech signal (vector) 

H: impulse response convolution matrix H of the 
synthesis filter 

p: adaptive code vector 
ga: adaptive code gain (pitch gain) 
The random codebook search is a process of 
specifying a random code vector c which minimizes 
coding distortion that is defined by an equation 3 in 
a distortion calculator 16 as shown in TIG . 2A. 

\\x-gc\(t\\ 2 (3) 
where x: target (vector) for the random codebook 
search 

H: impulse response convolution matrix of the 
synthesis filter 

c: random code vector 
gc: random code gain. 

The distortion calculator 16 controls a control 
switch 21 to switch a random code vector to be read 
from the random codebook 15 until the random code 
vector c is specified. 
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An actual CELP type speech coder has a structure 
in FIG. 2B to reduce the computational complexities, 
and a distortion calculator 16' carries out a process 
of specifying a code number which maximizes a 
5 distortion measure in an equation 4. 



(x'Hc) 2 ((x'H)c) 2 (x^c) 2 (x"c) 2 
||Hc|| 2 = ||Hc|| 2 = ||Hc|| 2 c'H'Hc 

where x: target (vector) for the random codebook 
search 

10 H . impulse response convolution matrix of the 

synthesis filter 

H': transposed matrix of H 

X s : time reverse synthesis of x using H (x ,t: = 
x'H) • 

15 c: random code vector. 

Specifically, the random codebook control switch 
21 is connected to one terminal of the random 
codebook 15 and the random code vector c is read from 
an address corresponding to that terminal. The read 

20 random code vector o is synthesized with vocal tract 
information by the synthesis filter 13, producing a 
synthesized vector Ho. Then, the distortion 
calculator 16' computes a distortion measure in the 
equation 4 using a vector x' obtained by a time 

25 reverse process of a target x, the vector He 

resulting from synthesis of the random code vector in 



the synthesis filter and the random code vector c. 
As the random codebook control switch 21 is switched, 
computation of the distortion measure is performed 
for every random code vector in the random codebook. 

Finally, the number of the random codebook 
control switch 21 that had been connected when the 
distortion measure in the equation 4 became maximum 
is sent to a code output section 17 as the code 
number of the random code vector. 

FIG. 2C shows a partial structure of a speech 
decoder. The switching of the random codebook 
control switch 21 is controlled in such a way as to 
read out the random code vector that has a 
transmitted code number. After a transmitted random 
code gain gc and filter coefficient are set in an 
amplifier 23 and a synthesis filter 24, a random code 
vector is read out to restore a synthesized speech. 

In the above-described speech coder/speech 
decoder, the greater the number of random code 
vectors stored as excitation information in the 
random codebook 15 is, the more possible it is to 
search a random code vector close to the excitation 
vector of an actual speech. As the capacity of the 
random codebook (ROM) is limited, however, it is not 
possible to store countless random code vectors 
corresponding to all the excitation vectors in the 
random codebook. This restricts improvement on the 
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quality of speeches. 

Also has proposed an algebraic excitation which 
can significantly reduce the computational 
complexities of coding distortion in a distortion 
calculator and can eliminate a random codebook (ROM) 
(described in "8 KBIT/S ACELP CODING OF SPEECH WITH 
10 MS SPEECH-FRAME: A CANDIDATE FOR CCITT 
STANDARDIZATION": R. Salami, C. Laflamme, J-P. Adoul. 
ICASSP '94, pp. 11-97 to 11-100, 1994). 

The algebraic excitation considerably reduces the 
complexities of computation of coding distortion by 
previously computing the results of convolution of 
the impulse response of a synthesis filter and a. 
time-reversed target and the autocorrelation of the 
synthesis filter and developing' them in a memory. 
Further, a ROM in which random code vectors have been 
stored is eliminated by algebraically generating 
random code vectors. A CS-ACELP and ACELP which use 
the algebraic excitation have been recommended 
respectively as G. 729 and G. 723.1 from the ITU-T. 

In the CELP type speech coder/speech decoder 
equipped with the above-described algebraic 
excitation in a random codebook section, however, a 
target for a random codebook search is always coded 
with a pulse sequence vector, which puts a limit to 
improvement on. speech quality. 
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Disclosure of Invention 

It is therefore a primary object of the present 
invention to provide an excitation vector generator, 
a speech coder and a speech decoder, which can 
significantly suppress the memory capacity as 
compared with a case where random code vectors are 
stored directly in a random codebook, and can improve 
the speech quality 

It is a secondary object of this invention to 
provide an excitation vector generator, a speech 
coder and a speech decoder, which can generate 
complicated random code vectors as compared with a 
case where an algebraic excitation is provided in a 
random codebook section and a target for a random 
codebook search is coded with a pulse sequence vector, 
and can improve the speech quality. 

In this invention, the fixed code vector reading 
section and fixed codebook of a conventional CELP 
type speech coder /decoder are respectively replaced 
with an oscillator, which outputs different vector 
sequences in accordance with the values of input 
seeds, and a seed storage section which stores a 
plurality of seeds (seeds of the oscillator). This 
eliminates the need for fixed code vectors to be 
stored directly in a fixed codebook (ROM) and can 
thus reduce the memory capacity significantly. 

Further, according to this invention, the random 



code vector reading section and random codebook of 
the conventional CELP type speech coder /decoder are 
respectively replaced with an oscillator and a seed 
storage section. This eliminates the need for random 
code vectors to be stored directly in a random 
codebook (ROM) and can thus reduce the memory 
capacity significantly. 

The invention is an excitation vector generator 
which is so designed as to store a plurality of fixed 
waveforms, arrange the individual fixed waveforms at 
respective start positions based on start position 
candidate information and add those fixed waveforms 
to generate an excitation vector. This can permit an 
excitation vector close to an actual speech to be 
generated. 

Further, the invention is a CELP type speech 
coder/decoder constructed by using the above 
excitation vector generator as a random codebook. A 
fixed waveform arranging section may algebraically 
generate start position candidate information of 
fixed waveforms. 

Furthermore, the invention is a CELP type speech 
coder/decoder, which stores a plurality of fixed 
waveforms, generates an impulse with respect to start 
position candidate information of each fixed waveform, 
convolutes the impulse response of a synthesis filter 
and each fixed waveform to generate an impulse 
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response for each fixed waveform, computes the 
autocorrelations and correlations of impulse 
responses of the individual fixed waveforms and 
develop them in a correlation matrix. This can 
provide a speech coder/decoder which improves the 
quality of a synthesized speech at about the same 
computation cost as needed in a case of using an 
algebraic excitation as a random codebook. 

Moreover, this invention is a CELP type speech 
coder/decoder equipped with a plurality of random 
codebooks and switch means for selecting one of the 
random codebooks. At least one random codebook may 
be the aforementioned excitation vector generator, or 
at least one random codebook may be a vector storage 
section having a plurality of random number sequences 
stored therein or a pulse sequences storage section 
having a plurality of random number sequences stored 
therein, or at least two random codebooks each having 
the aforementioned excitation vector generator may be 
provided with the number of fixed waveforms to be 
stored differing from one random codebook to another, 
and the switch means selects one of the random 
codebooks so as to minimize coding distortion at the 
time of searching a random codebook or adaptively 
selects one random codebook according to the result 
of analysis of speech segments. 
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Brief Description of Drawings 

FIG. 1 is a schematic diagram of a conventional 
CELP type speech coder; 

FIG. 2A is a block diagram of an excitation 
5 vector generating section in the speech coder in FIG. 

1; 

FIG. 2B is a block diagram of a modification of 
the excitation vector generating section which is 
designed to reduce the computation cost; 

10 fig. 2C is a block diagram of an excitation 

vector generating section in a speech decoder which 
is used as a pair with the speech coder in FIG. 1; 

FIG. 3 is a block diagram of the essential . 
portions of a speech coder according to a first mode; 

15 FIG. 4 is a block diagram of an excitation vector 

generator equipped in the speech coder of the first 
mode ; 

FIG. 5 is a block diagram of the essential 
portions of a speech coder according to a second 
20 mode; 

FIG. 6 is a block diagram of an excitation vector 
generator equipped in the speech coder of the second 
mode; 

FIG. 7 is a block diagram of the essential 
25 portions of a speech coder according to third and 

fourth modes; 

FIG. 8 is a block diagram of an excitation vector 
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generator equipped in the speech coder of the third 
mode; 

FIG. 9 is a block diagram of a non-linear digital 
filter equipped in the speech coder of the fourth 
mode; 

FIG. 10 is a diagram of the adder characteristic 
of the non-linear digital filter shown in FIG. 9; 

FIG. 11 is a block diagram of the essential 
portions of a speech coder according to. a fifth mode; 

FIG. 12 is a block diagram of the essential 
portions of a speech coder according to a sixth mode; 

FIG. 13A is a block diagram of the essential 
portions of a speech coder according to a .seventh 
mode; 

FIG. 13B is a block diagram of the essential 
portions of the speech coder according to the seventh 
mode ; 

FIG. 14 is a block diagram of the essential 
portions of a speech decoder according to an eighth 
mode; 

FIG. 15 is a block diagram of the essential 
portions of a speech coder according to a ninth mode; 

FIG. 16 is a block diagram of a quantization 
target LSP adding section equipped in the speech 
coder according. to the ninth mode; 

FIG. 17 is a block diagram of an LSP 
quantizing/decoding section equipped in the. speech 



13 

coder according to the ninth mode; 

FIG. 18 is a block diagram of the essential 
portions of a speech coder according to a tenth mode; 

FIG. 19A is a block diagram of the essential 
portions of a speech coder according to an eleventh 
mode ; 

FIG. 19B is a block diagram of the essential 
portions of a speech decoder according to the 
eleventh mode; 

FIG. 20 is a block diagram of the essential 
portions of a speech coder according to a twelfth 
mode; 

FIG. 21 is a block diagram of the essential 
portions of a speech coder according to a thirteenth 
mode; 

FIG. 22 is a block diagram of the essential 
portions of a speech coder according to a fourteenth 
mode; 

FIG. 23 is a block diagram of the essential 
portions of a speech coder according to a fifteenth 
mode; 

FIG. 24 is a block diagram of the essential 
portions of a speech coder according to a sixteenth 
mode; 

FIG. 25 is a block diagram of a vector quantizing 
section in the sixteenth mode; 

FIG. 26 is a block diagram of a parameter coding 
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section of a speech coder according to a seventeenth 
mode; and 

FIG. 27 is a block diagram of a noise canceler 
according to an eighteenth mode. 

5 

Best Modes for Carrying Out the Invention 

Preferred modes of the present invention will now 
be described specifically with reference to the 
accompanying drawings. 
10 (First Mode) 

FIG. 3 is a block diagram of the essential 
portions of a speech coder according to this mode. 
This speech coder comprises an excitation vector 
generator 30, which has a seed storage section- 31 and 
15 an oscillator 32. and an LPC synthesis filter 33. 

Seeds (oscillation seeds) 34 output from the seed 
storage section 31 are input to the oscillator 32. 
The oscillator 32 outputs different vector sequences 
according to the values of the input seeds. The 
20 oscillator 32 oscillates with the content according 
to the value of the seed (oscillation seed) 34 and 
outputs an excitation vector 35 as a vector sequence. 
The LPC synthesis filter 33 is supplied with vocal 
tract information in the form of the impulse response 
25 convolution matrix of the synthesis filter, and 

performs convolution on the excitation vector 35 with 
the impulse response, yielding a synthesized speech 
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36. The Impulse response convolution of the 
excitation vector 35 is called LPC synthesis. 

FIG. 4 shows the specific structure the 
excitation vector generator 30. A seed to be read 
from the seed storage section 31 is switched by a 
control switch 41 for the seed storage section in 
accordance with a control signal given from a 
distortion calculator. 

Simple storing of a plurality of seeds for 
outputting different vector sequences from the 
oscillator 32 in the seed storage section 31 can 
allow more random code vectors to be generated with 
less capacity as compared with a case where 
complicated random code vectors are directly stored 
in a random codebook. 

Although this mode has been described as a speech 
coder, the excitation vector generator 30 can be 
adapted to a speech decoder. In this case, the 
speech decoder has a seed storage section with the 
same contents as those of the seed storage section 31 
of the speech coder and the control switch 41 for the 
seed storage section is supplied with a seed number 
selected at the time of coding. 
(Second Mode) 

FIG. 5 is a block diagram of the essential 
portions of a speech coder according to this mode. 
This speech coder comprises an excitation vector 
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generator 50. which has a seed storage section 51 and 
a non-linear oscillator 52. and an LPC synthesis 
filter 53. 

Seeds (oscillation seeds) 54 output from the seed 
5 storage section 51 are input to the non-linear 

oscillator 52. An excitation vector 55 as a vector 
sequence output from the non-linear oscillator 52 is 
input to the LPC synthesis filter 53. The output of 
the LPC synthesis filter 53 is a synthesized speech 
10 56. 

The non-linear oscillator 52 outputs different 
vector sequences according to the values of the input 
seeds 54, and the LPC synthesis filter 53 performs 
LPC synthesis on the input excitation vector 55 to 
15 output the synthesized speech 56. 

FIG. 6 shows the functional blocks of the 
excitation vector generator 50. A seed to be read 
from the seed storage section 51 is switched by a 
control switch 41 for the seed storage section in 
20 accordance with a control signal given from a 
distortion calculator. 

The use of the non-linear oscillator 52 as an 
oscillator in the excitation vector 50 can suppress 
divergence with oscillation according to the non- 
25 linear characteristic, and can provide practical 
excitation vectors. 

Although this mode has been described as a speech 
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coder, the excitation vector generator 50 can be 
adapted to a speech decoder. In this case, the 
speech decoder has a seed storage section with the 
same contents as those of the seed storage section 51 
of the speech coder and the control switch 41 for the 
seed storage section is supplied with a seed number 
selected at the time of coding. 
(Third Mode) 

FIG. 7 is a block diagram of the essential 
portions of a speech coder according to this mode. 
This speech coder comprises an excitation vector 
generator 70, which has a seed storage section 71 and 
a non-linear digital filter 72. and an LPC synthesis 
filter 73. In the diagram, numeral " 74" denotes a 
seed (oscillation seed) which' is output from the seed 
storage section 71 and input to the non-linear 
digital filter 72, numeral "75" is an excitation 
vector as a vector sequence output from the. non- 
linear digital filter 72, and numeral "76" is a 
synthesized speech output from the LPC synthesis 
filter 73. 

The excitation vector generator 70 has a control 
switch 41 for the seed storage section which switches 
a seed to be read from the seed storage section 71 in 
accordance with a control signal given from a 
distortion calculator, as shown in FIG. 8. 

The non-linear digital filter 72 outputs 
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different vector sequences according to the values of 
the input seeds, and the LPC synthesis filter 73 
performs LPC synthesis on the input excitation vector 
75 to output the synthesized speech 76. 
5 The use of the non-linear digital filter 72 as an 

oscillator in the excitation vector 70 can suppress 
divergence with oscillation according -to the non- 
linear characteristic, and can provide practical 
excitation vectors. Although this mode has been 
10 described as a speech coder, the excitation vector 

generator 70 can be adapted to a speech decoder. In 
this case, the speech decoder has a seed storage 
section with the same contents as those of the seed 
storage section 71 of the speech coder and the 
15 control switch 41 for the seed storage section is 

supplied with a seed number selected at the time of 

coding. 

(Fourth Mode) 

A speech coder according to this mode comprises 
20 an excitation vector generator 70, which has a seed 

storage section 71 and a non-linear digital filter 72, 
and an LPC synthesis filter 73, as shown in FIG. 7. 

Particularly, the non-linear digital filter 72 
has a structure as depicted in FIG. 9. This non- 
25 linear digital filter 72 includes an adder 91 having 

a non-linear adder characteristic as shown in FIG. 10, 
filter state holding sections 92 to 93 capable of 
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retaining the states (the values of y(k-l) to y(k-N)) 
of the digital filter, and multipliers 94 to 95, 
which are connected in parallel to the outputs of the 
respective filter state holding sections 92-93, 
multiply filter states by gains and output the 
results to the adder 91. The initial values of the 
filter states are set in the filter state holding 
sections 92-93 by seeds read from the seed storage 
section 71. The values of the gains of the 
multipliers 94-95 are so fixed that the polarity of 
the digital filter lies outside a unit circle on a Z 
plane. 

FIG. 10 is a conceptual diagram of the non-linear 
adder characteristic of the adder 91 equipped in the 
non-linear digital filter 72, and shows the 
input/output relation of the adder 91 which has a 2's 
complement characteristic. The adder 91 first 
acquires- the sum of adder inputs or the sum of the 
input values to the adder 91, and then uses the non- 
linear characteristic illustrated in FIG. 10 to 
compute an adder output corresponding to the input 
sum . 

In particular, the non-linear digital filter 72 
is a second-order all-pole model so that the two 
filter state holding sections 92 and 93 are connected 
in series, and the multipliers 94 and 95 are 
connected to the outputs of the filter state holding 
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sections 92 and 93. Further, the digital filter in 
which the non-linear adder characteristic of the 
adder 91 is a 2* s complement characteristic is used. 
Furthermore, the seed storage section 71 retains seed 
vectors of 32 words as particularly described in. 
Table 1. 

Table 1: Seed vectors for generating random code 
vectors 



i 


Sy(n-l)[i] 


Sy(n-2)fi] 


i 


Sy(n-l)m 


Sy(n-2)[i] 


1 


0.250000 


0.250000 


9 


0.109521 


-0.761210 


2 


-0.564643 


-0.104927 


10 


-0.202115 


0.198718 


3 


0.173879 


-0.978792 


11 


-0.095041 


0.863849 


4 


0.632652 


0.951133 


12 


-0.634213 


0.424549 


5 


0.920360 


-0.113881 


13 


0.948225 


-0.184861 


6 


0.864873 


-0.860368 


14 


-0.958269 


0.969458 


7 


0.732227 


0.497037 


15 


0.233709 


-0.057248 


8 


0.917543 


-0.035103 


16 


-0.852085 


-0.564948 



10 In the thus constituted speech coder, seed 

vectors read from the seed storage section 71 are 
given as initial values to the filter state holding 
sections 92 and 93 of the non-linear digital filter 
72. Every time zero is input to the adder 91 from an 

15 input vector (zero sequences), the non-linear digital 
filter 72 outputs one sample (y(k) ) at a time which 
is sequentially transferred as a filter state to the 
filter state holding sections 92 and 93. At this 
time, the multipliers 94 and 95 multiply the filter 

20 states output from the filter state holding sections 
92 and 93 by gains al and a2 respectively. The adder 
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91 adds the outputs of the multipliers 94 and 95 to 
acquire the sum of the adder inputs, and generates an 
adder output which is suppressed between +1 to -1 
based on the characteristic in FIG. 10. This adder 
5 output (y(k+l)) is output as an excitation vector and 
is sequentially transferred to the filter state 
holding sections 92 and 93 to produce a new sample 
(y(k+2)). 

Since the coefficients 1 to N of the multipliers 
10 94-95 are fixed so that particularly the poles of the 
non-linear digital filter lies outside a unit circle 
on the Z plane according to this mode, thereby 
providing the adder 91 with a non-linear adder 
characteristic, the divergence of the output can be 
15 suppressed even when' the input to the non-linear 
digital filter 72 becomes large, and excitation 
vectors good for practical use can be kept generated. 
Further, the randomness of excitation vectors to be 
generated can be secured. 
20 Although this mode has been described as a speech 

coder, the excitation vector generator 70 can be 
adapted to a speech decoder. In this case, the 
speech decoder has a seed storage section with the 
same contents as those of the seed storage section 71 
25 of the speech coder and the control switch 41 for the 
seed storage section is supplied with a seed number 
selected at the time of coding. 
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(Fifth Mode) 

FIG. 11 is a block diagram of the essential 
portions of a speech coder according to this mode. 
This speech coder comprises an excitation vector 
generator 110, which has an excitation vector storage 
section 111 and an added- excitation- vector generator 
112, and an LPC synthesis filter 113. 

The excitation vector storage section 111 retains 
old excitation vectors which are read by a control 
switch upon reception of a control signal from an 
unillustrated distortion calculator. 

The added-excitation-vector generator 112 
performs a predetermined process, indicated by an 
added-excitation-vector number excitation vector, on 
an old excitation' vector read from the storage 
section 111 to produce a new excitation vector. The 
added-excitation-vector generator 112 has a function 
of switching the process content for an old 
excitation vector in accordance with the added- 
excitation-vector number. 

According to the thus constituted speech coder, 
an added-excitation-vector number is given from the 
distortion calculator which is executing, for example, 
an excitation vector search. The added-excitation- 
vector generator 112 executes different processes on 
old excitation vectors depending on the value of the 
input added-excitation-vector number to generate 
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different added excitation vectors, and the LPC 
synthesis filter 113 performs LPC synthesis on the 
input excitation vector to output a synthesized 
speech. 

5 According to this mode, random excitation vectors 

can be generated simply by storing fewer old 
excitation vectors in the excitation vector storage 
section 111 and switching the process contents by 
means of the added-excitation-vector generator 112, 
10 and it is unnecessary to store random code vectors 
directly in a random codebook (ROM). This can 
significantly reduce the memory capacity. 

Although this mode has been described as a speech 
coder, the excitation vector generator 110 can be 
15 adapted to a speech decoder. In this case, the 
speech decoder has an excitation vector storage 
section with the same contents as those of the 
• excitation vector storage section 111 of the speech 
coder and an added-excitation-vector number selected 
20 at the time of coding is given to the added- 
excitation-vector generator 112. 

(Sixth Mode) 

FIG. 12 shows the functional blocks of an 
excitation vector generator according to this mode. 
25 This excitation vector generator comprises an added- 
excitation-vector generator 120 and an excitation 
vector storage section 121 where a plurality of 
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element vectors 1 to N are stored. 

The added-excitation-vector generator 120 
includes a reading section 122 which performs a 
process of reading a plurality of element vectors of 
different lengths from different positions in the 
excitation vector storage section 121, a reversing 
section 123 which performs a process of sorting the 
read element vectors in the reverse order, a 
multiplying section 124 which performs a process of 
multiplying a plurality of vectors after the reverse 
process by different gains respectively, a decimating 
section 125 which performs a process of shortening 
the vector lengths of a plurality of vectors after 
the multiplication, an interpolating section 126 
which performs a process of lengthening the vector 
lengths of the thinned vectors, an adding section 127 
which performs a process of adding the interpolated 
vectors, and a process determining/instructing 
section 128 which has a function of determining a 
specific processing scheme according to the value of 
the input added-excitation-vector number and 
instructing the individual sections and a function of 
holding a conversion map (Table 2) between numbers 
and processes which is referred to at the time of 
determining the specific process contents. 
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Table 2: Conversion map between numbers and 



processes 



Bit stream(MS...LSB) 


6 


5 


4 


3 


2 


1 


0 


VI reading position 
(16 kinds) 








3 


2 


1 


0 


V2 reading position 
(32 kinds) 


2 


1 


0 






4 


3 


V3 reading position 
(32 kinds) 


4 


3 


2 


1 


0 






Reverse process 
( 2kinds ) 














0 


Multiplication 
(4 kinds) 


1 


0 












decimating process 
(4 kinds) 








1 


0 






interpolation 
(2 kinds) 






0 











The added-excitation-vector generator 120 will 



now be described more specifically. The added- 
excitation-vector generator 120 determines specific 
processing schemes for the reading section 122, the 
reversing section 123, the multiplying section 124, 
the decimating section 125, the interpolating section 
126 and the adding section 127 by comparing the input 
added-excitation-vector number (which is a sequence 
of 7 bits taking any integer value from 0 to 127) 
with the conversion map between numbers and processes 
(Table 2), and reports the specific processing 
schemes to the respective sections. 

The reading section 122 first extracts an element 
vector 1 (VI) of a length of 100 from one end of the 
excitation vector storage section 121 to the position 
of nl, paying attention to a sequence of the lower 
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four bits of the input added-excitation- vector number 
(nl: an integer value from 0 to 15). Then, the 
reading section 122 extracts an element vector 2 (V2) 
of a length of 78 from the end of the excitation 
vector storage section 121 to the position of n2+14 
(an integer value from 14 to 45), paying attention to 
a sequence of five bits (n2: an integer value from 14 
to 45) having the lower two bits and the upper three 
bits of the input added-excitation-vector number 
linked together. Further, the reading section 122 
performs a process of extracting an element vector 3 
(V3) of a length of Ns (= 52) from one end. of the 
excitation vector storage section 121 to the position 
of n3+46 (an integer value from 46 to 77), paying 
attention to a sequence of the upper five bits of the 
input added-excitation-vector number (n3: an integer 
value from 0 to 31), and sending VI, V2 and V3 to the 
reversing section 123 . 

The reversing section 123 performs a process of 
sending a vector having VI, V2 and V3 rearranged in 
the reverse order to the multiplying section 124 as 
new VI, V2 and V3 when the least significant bit of 
the added-excitation-vector number is "0" and sending 
VI, V2 and V3 as they are to the multiplying section 
124 when the least significant bit is "1." 

Paying attention to a sequence of two bits having 
the upper seventh and sixth bits of the added- 
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excitation-vector number linked, the multiplying 
section 124 multiplies the amplitude of V2 by -2 when 
the bit sequence is "00," multiplies the amplitude of 
V3 by -2 when the bit sequence is "01 , " multiplies 
the amplitude of VI by -2 when the bit sequence is 
"10" or multiplies the amplitude of V2 by 2 when the 
bit sequence is "11/ and sends the result as new VI, 
V2 and V3 to the decimating section 125.. 

Paying attention to a sequence of two bits having 
the upper fourth and third bits of the added- 
excitation-vector number linked, the decimating 
section 125 

(a) sends vectors of 26 samples extracted every other 
sample from VI, V2 and V3 as new VI, V2 and V3 to the 
interpolating section 126 when the bit sequence is 
"00," (b) sends vectors of 26 samples extracted every 
other sample from VI and V3 and every third sample 
from V2 as new VI, V3 and V2 to the interpolating . 
section 126 when the bit, sequence is "01," 

(c) sends vectors of 26 samples extracted every 
fourth sample from VI and every other sample from V2 
and V3 as new VI, V2 and V3 to the interpolating 
section 126 when the bit sequence is "10," and 

(d) sends vectors of 26 samples extracted every 
fourth sample from VI, every third sample from V2 and 
every other sample from V3 as new VI, V2 and V3 to 
the interpolating section 126 when the bit sequence 
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is "11." 

Paying attention to the upper third bit of the 
added-excitation-vector number, the interpolating 
section 126 

(a) sends vectors which have VI, V2 and V3 
respectively substituted in even samples of zero 
vectors of a length Ns (= 52) as new VI, V2 and V3 to 
the adding section 127 when the value of the third 
bit is "0" and 

(b) sends vectors which have VI, V2 and V3 
respectively substituted in odd samples of zero 
vectors of a length Ns (= 52) as new VI, V2 and V3 to 
the adding section 127 when the value of the third 
bit is "1. " 

The adding section 127 adds the three vectors (VI, 
V2 and V3) produced by the interpolating section 126 
to generate an added excitation vector. 

According to this mode, as apparent from the 
above, a plurality of processes are combined at 
random in accordance with the added-excitation-vector 
number to produce random excitation vectors, so that 
it is unnecessary to store random code vectors as 
they are in a random codebook ( ROM) , ensuring a 
significant reduction in memory capacity. 

Note that the use of the excitation vector 
generator of this mode in the speech coder of the 
fifth mode can allow complicated and random 
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excitation vectors to be generated without using a 
large -capacity random codebook. 
(Seventh Mode) 

A description will now be given of a seventh mode 
in which the excitation vector generator of any one 
of the above-described first to sixth modes is used 
in a CELP type speech coder that is based on the PSI- 
CELP, the standard speech coding/decoding system for 
PDC digital portable telephones in Japan. 

FIG. 13A is presents a block diagram of a speech 
coder according to the seventh mode. In this speech 
coder, digital input speech data 1300 is supplied to 
a buffer 1301 frame by frame (frame length Nf = 104). 
At this time, old data in the buffer 1301 is updated 
with new data supplied. A frame power 
quantizing/decoding section 1302 first reads a 
processing frame s(i) (0 ^ i ^ Nf-1) of a length Nf 
(= 104) from the buffer 1301 and acquires mean power 
amp of samples in that processing frame from an 
equation 5 . 



where amp: mean power of samples in a processing 
frame 

i: element number (0 ^ i ^ Nf-1) in the 
processing frame 




2 



Nf 
i = 0 



s 2 (i) 



(5) 



30 

s(i): samples in the processing frame 
Nf : processing frame length (= 52). 
The acquired mean power amp of samples in the 
processing frame is converted to a logarithmically 
5 converted value amplog from an equation 6. 

P B log 10 (255 + l) 
where amplog: logarithmically converted value of the 
mean power of samples in the processing frame 

amp: mean power of samples in the processing 

10 frame. 

The acquired amplog is subjected to scalar 
quantization using a scalar-quantization table Cpow 
of 10 words as shown in Table 3 stored in a power 
quantization table storage section 1303 to acquire an 

15 index of power Ipow of four bits, decoded frame power 
spow is obtained from the acquired index of power 
Ipow. and the index of power Ipow and decoded frame 
power spow are supplied to a parameter coding section 
1331. The power quantization table storage section 

20 1303 is holding a power scalar-quantization table 

(Table 3) of 16 words, which is referred to when the 
frame power quantizing/decoding section 1302 carries 
out scalar quantization of the logarithmically 
converted value of the mean power of the samples in 

25 the processing frame. 
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Table 3: Power scalar-quantization table 



i 


Cpow(i) 


i 


Cpow(i) 


1 


0.00675 


9 


0.39247 


2 


0.06217 


10 


0.42920 


3 


0.10877 


11 


0.46252 


4 


0.16637 


12 


0.49503 


5 


0.21876 


13 


0.52784 


6 


0.26123 


14 


0.56484 


7 


0.30799 


15 


0.61125 


8 


0.35228 


16 


0.67498 



An LPC analyzing section 1304 first reads 
analysis segment data of an analysis segment length 
Nw (= 256) from the buffer 1301, multiplies the read 
5 analysis segment data by a Hamming window of a window 
length Nw (= 256) to yield a Hamming windowed 
analysis data and acquires the autocorrelation 
function of the obtained Hamming windowed analysis 
data to a prediction order Np (= 10). The .obtained 

10 autocorrelation function is multiplied by a lag 

window table (Table 4) of 10 words stored in a lag 
window storage section 1305 to acquire a Hamming 
windowed autocorrelation function, performs linear 
predictive analysis on the obtained Hamming windowed 

15 autocorrelation function to compute an LPC parameter 
a(i) (l £ i £ Np) and outputs the parameter to a 
pitch pre-selector 1308. 



20 
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Table 4: Lag window table 



i 


Wlag(i) 


i 


Wlag(i) 


0 


0.9994438 


5 


0.9801714 


1 


0.9977772 


6 


0.9731081 


2 


0.9950056 


7 


0.9650213 


3 


0.9911382 


8 


0.9559375 


4 


0.9861880 


9 


0.9458861 



Next, the obtained LPC parameter a(i) is 
converted to an LSP (Linear Spectrum Pair) o> ( i ) (1 ^ 
i ^ Np) which is in turn output to an LSP 
quantizing/decoding section 1306. The lag window 
storage section 1305 is holding a lag window table to 
which the LPC analyzing section refers. 

The LSP quantizing/decoding section 1306 first 
refers to a vector quantization table of an LSP 
stored in a LSP quantization table storage section 
1307 to perform vector quantization on the LSP 
received from the LPC analyzing section 1304, thereby 
selecting an optimal index, and sends the selected 
index as an LSP code lisp to the parameter coding 
section 1331. Then, a centroid corresponding to the 
LSP code is read as a decoded LSP coq(i) (1 £ i £ Np) 
from the LSP quantization table storage section 1307, 
and the read decoded LSP is sent to an LSP 
interpolation section 1311. Further, the decoded LSP 
is converted to an LPC to acquire a decoded LSP Oq(i) 
(1 ^ i ^ Np). which is in turn sent to a spectral 
weighting filter coefficients calculator 1312 and a 
perceptual weighted LPC synthesis filter coefficients 
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calculator 1314. The LSP quantization table storage 
section 1307 is holding an LSP vector quantization 
table to which the LSP quantizing/decoding section 
1306 refers when performing vector quantization on an 
LSP. 

The pitch pre-selector 1308 first subjects the 
processing frame data s(i) (0 ^ i ^ Nf-1) read from 
the buffer 1301 to inverse filtering using the LPC a 
(i) (1 £t i £ Np) received from the LPC analyzing 
section 1304 to obtain a linear predictive residual 
signal res(i) (0 ^ i ^ Nf-1), computes the power of 
the obtained linear predictive residual signal res(i), 
acquires a normalized predictive residual power resid 
resulting from normalization of the power of the 
computed residual signal with the power of speech 
samples of a processing subframe, and sends the 
normalized predictive residual power to the parameter 
coding section 13311 ' Next, the linear predictive 
residual signal res(i) is multiplied by a Hamming 
window of a length Nw (= 256) to produce a. Hamming 
windowed linear predictive residual signal resw(i) (0 
^ i ^ Nw-1), and an autocorrelation function <£int(i) 
of the produced resw(i) is obtained over a range of 
Lmin-2 ^ i ^ Lmax+2 (where Lmin is 16 in the 
shortest analysis segment of a long predictive 
coefficient and Lmax is 128 in the longest analysis 
segment of a long predictive coefficient). A 
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polyphase filter coefficient Cppf (Table 5) of 28 
words stored in a polyphase coefficients storage 
section 1309 is convoluted in the obtained 
autocorrelation function <*>int(i) to acquire an 
autocorrelation function <*>dq(i) at a fractional 
position shifted by -1/4 from an integer lag int, an 
autocorrelation function 4>aq(i) at a fractional 
position shifted by +1/4 from the integer lag int, 
and an autocorrelation function <£ah(i) at a 
fractional position shifted by +1/2 from the integer 
lag int . 



Table 5: Polyphase filter coefficients Cppf 



i 


Cppf(i) 


i 


Cppf(i) 


i 


Cppf(i) 


i 


Cppf(i) 


0 


0.100035 


7. 


0.000000 


14 


-0.128617 


21 


-0.212207 


1 


-0.180063 


8 


0.000000 


15 


0.300105 


22 


0.636620 


2 


0.900316 


9 


1.000000 


16 


0.900316 


23 


0.636620 


3 


0.300105 


10 


0.000000 


17 


-0.180063 


24 


-0.212207 


4 


-0.128617 


11 


0.000000 


18 


0.100035 


25 


0.127324 


5 


0.081847 


12 


0.000000 


19 


-0.069255 


26 


-0.090946 


6 


-0.060021 


13 


0.000000 


20 


0.052960 


27 


0. 070736 



Further, for each argument i in a range of Lmin- 
^ i ^ Lmax+2, a process of an equation 7 of. 
substituting the largest one of $int(i), <*>dq(i), <t> 
aq(i) and *ah(i) in <*>max(i) to acquire (Lmax - Lmin 
+ 1) pieces of *max(i) . 
^max(i) = MAX(0int(i),^dq(i),0aq(i),0ah(i)) 

0 max(i) : maximum value of 0int(i),0dq(i),0aq(i),0ah(i) (7) 
where 4>max(i) : the maximum value among <f>int(i), <t> 
dq(i), <f>aq(i), <f>ah(i) 
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I: analysis segment of a long predictive 
coefficient (Lmin 2s i =s Lmax) 

Lmin: shortest analysis segment (= 16) of the 
long predictive coefficient 

Lmax: longest analysis segment (=128) of the 
long predictive coefficient 

<*>int(i): autocorrelation function of an integer 
lag (int) of a predictive residual signal 

<f>dq(i): autocorrelation function of a fractional 
lag (int-1/4) of the predictive residual signal 

$aq(i): autocorrelation function of a fractional 
lag (int+1/4) of the predictive residual signal 

*ah(i): autocorrelation function of a fractional 
lag (int+1/2) of the predictive residual signal. 
15 Larger top six are selected from the acquire 

(Lmax - Lmin + 1) pieces of <f>max(i) and are saved as 
pitch candidates psel(i) (0 3 i £ 5), and the linear 
predictive residual signal' res ( i) and the first pitch 
candidate psel(0) are sent to a pitch weighting 
filter calculator 1310 and psel(i) (0 ^ i ^ 5) to an 
adaptive code vector generator 1319. 

The polyphase coefficients storage section 1309 
is holding polyphase filter coefficients to be 
referred to when the pitch pre-selector 1308 acquires 
the autocorrelation of the linear predictive residual 
signal to a fractional lag precision and when the 
adaptive code vector generator 1319 produces adaptive 
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code vectors to a fractional precision. 

The pitch weighting filter calculator 1310 
acquires pitch predictive coefficients cov(i) (0 ^ i 
^ 2) of a third order from the linear predictive 
residuals res(i) and the first pitch candidate 
psel(O) obtained by the pitch pre-selector 1308. The 
impulse response of a pitch weighting filter Q(z) is 
obtained from an equation which uses the acquired 
pitch predictive coefficients cov(i) (0 ^ i ^ 2), 
and is sent to the spectral weighting filter 
coefficients calculator 1312 and a perceptual 
weighting filter coefficients calculator 1313. 
2 

Q(z) =1+2 cov ^ x Api xz ~ P sel (°) + 1 " 1 ( 8 } 

where Q(z): transfer function of the pitch weighting 
filter 

cov(i): pitch predictive coefficients ( 0 i =a 

2) 

Api: pitch weighting constant (= 0.4) 
psel(0): first pitch candidate. 

The LSP interpolation section 1311 first acquires 
a decoded interpolated LSP cointp(n.i) (1 £ i £ Np) 
subframe by subframe from an equation 9 which uses a 
decoded LSP a)q(i) for the current processing frame, 
obtained by the LSP quantizing/decoding section 1306, 
and a decoded LSP coqp(i) for a previous processing 
frame which has been acquired and saved earlier. 



37 



10 



20 



co int p(n, i) 



0.4xcuq(i) + 0.6xo)qp(i) n-1 
coq(i) n = 2 



where co intp ( n , j ) : interpolated LSP of the n-th 
subframe 

n: subframe number (= 1,2) 
cuq(i): decoded LSP of a processing frame 
coqp(i): decoded LSP of a previous processing 
frame . 

A decoded interpolated LPC aq(n,i) (1 £ i £ Np) 
is obtained by converting the acquired co intp(n,i) to 
an LPC and the acquired, decoded interpolated LPC a 
q(n,i) (1 ^ i ^ Np ) is sent to the spectral 
weighting filter coefficients calculator 1312 and the 
perceptual weighted LPC synthesis filter coefficients 
15 calculator 1314. 

The spectral weighting filter coefficients 
calculator 1312, which constitutes an MA type 
spectral weighting filter I(z) in an equation 10, 
sends its impulse response to the perceptual 
weighting filter coefficients calculator 1313. 
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Nfir 

I(z)= ^ afir^xz -1 (10) 

where I ( z ) : transfer function of the MA type spectral 

weighting filter 

Nfir: filter order (= 11) of I(z) 

afir(i): filter order ( 1 ^ i ^ Nfir) of I(z). 
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Note that the impulse response afir(i) (1 £ 1 £ 
Nfir) in the equation 10 is an impulse response of an 
ARMA type spectral weighting filter G(z), given by an 
equation 11, cut after Nfir(= 11). 

l + ^ Np a^iJxMa'xz" 1 
5 G(z) = ^ ; ~ (ID 

where G(z): transfer function of the spectral 
weighting filter 

n: subframe number (= 1,2) 

Np: LPC analysis order (=10) 
10 a(n,ij: decoded interpolated LSP of the n-th 

subframe 

A. ma: numerator constant (= 0.9) of G(z) 
Aar: denominator constant (= 0.4) of G(z). 
The perceptual weighting filter coefficients 
15 calculator 1313 first constitutes a perceptual 
weighting filter W(z) which has as an impulse 
response the, result of convolution of the impulse, 
response of the spectral weighting filter I(z) 
received from the spectral weighting filter 
20 coefficients calculator 1312 and the impulse response 
of the pitch weighting filter Q(z) received from the 
pitch weighting filter calculator 1310, and sends the 
impulse response of the constituted perceptual 
weighting filter W(z) to the perceptual weighted LPC 
25 synthesis filter coefficients calculator 1314 and a 
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perceptual weighting section 1315. 

The perceptual weighted LPC synthesis filter 
coefficients calculator 1314 constitutes a perceptual 
weighted LPC synthesis filter H(z) from an equation 
5 12 based on the decoded interpolated LPC aq(n,i) 

received from the LSP interpolation section 1311 and 
the perceptual weighting filter W(z) received from 
the perceptual weighting filter coefficients 
calculator 1313. 

10 H(z)= =j=-i W(z) (12) 

1 + 7 . , aq(n,i)xz 

where H(z): transfer function of the perceptual 
weighted synthesis filter 
Np: LPC analysis order 

aq(n,i): decoded interpolated LPC of the n-th 
15 subframe 

n: subframe number (= 1,2) 

W(z): transfer function of the perceptual 
weighting filter (I(z) and Q(z) cascade-connected). 
The coefficient of the constituted perceptual 
20 weighted LPC synthesis filter H(z) is sent to a 

target vector generator A 1316, a perceptual weighted 
LPC reverse synthesis filter A 1317, a perceptual 
weighted LPC synthesis filter A 1321, a perceptual 
weighted LPC reverse synthesis filter B 1326 and a 
25 perceptual weighted LPC synthesis filter B 1329. 

The perceptual weighting section 1315 inputs a 
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subframe signal read from the buffer 1301 to the 
perceptual weighted LPC synthesis filter H(z) in a 
zero state, and sends its outputs as perceptual 
weighted residuals spw(i) (0 ^ i ^ Ns-1) to the 
target vector generator A 1316. 

The target vector generator A 1316 subtracts a 
zero input response Zres(i) (0 ^ i ^ Ns-1), which is 
an output. when a zero sequence is input to the 
perceptual weighted LPC synthesis filter H(z) 
obtained by the perceptual weighted LPC synthesis 
filter coefficients calculator 1314, from the 
perceptual weighted residuals, spw(i) (0 ^ i ^ Ns-1) 
obtained by the perceptual weighting section 1315, 
and sends the subtraction result to the perceptual 
weighted LPC reverse synthesis filter A 1317 and a 
target vector generator B 1325 as a 
target vector r(i) (0 ^ i ^ Ns-1) for selecting an 
excitation vector. 

The perceptual weighted LPC reverse synthesis 
filter A 1317 sorts the target vectors r(i) (0 ^ i ^ 
Ns-1) received from the target vector generator A 
1316 in a time reverse order, inputs the acquired 
vectors to the perceptual weighted LPC synthesis 
filter H(z) with the initial state of zero, and sorts 
its outputs again in a time reverse order to obtain 
time reverse synthesis rh(k) (0 ^ i ^ Ns-1) of the 
target vector, and sends the vector to a comparator A 
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1322 . 

Stored in an adaptive codebook 1318 are old 
excitation vectors which are referred to when the 
adaptive code vector generator 1319 generates 
adaptive code vectors. The adaptive code vector 
generator 1319 generates Nac pieces of adaptive code 
vectors Pacb(i,k) (0 ^ i ^ Nac-1, 0 ^ k ^ Ns-1, 6 
^ Nac ^ 24) based on six pitch candidates psel(j) (0 
^ j ^ 5) received from the pitch pre-selector 1308, 
and sends the vectors to an adaptive/fixed selector 
1320. Specifically, as shown in Table 6, adaptive 
code vectors are generated for four kinds of 
fractional lag positions per a single integer lag 
position when 16 ^ psel(j) ^ 44, adaptive code 
vectors are generated for two kinds of fractional lag 
positions per a single integer lag position when 46 ^ 
psel(j) ^ 64, and adaptive code vectors are generated 
for integer lag positions when 65 ^ psel(j) ^ 128. 
From this, depending on the' value of psel(j) (0 ^ j 
^ 5), the number of -adaptive code vector candidates 
Nac is 6 at a minimum and 24 at a maximum. 
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Table 6: Total number of adaptive code vectors 
and fixed code vectors 



rFr\4"s»1 nnmhor n "F VfiCtOTS 


255 I 


Number of adaptive code 
vectors 
16 <; psel(i) * 44 

. 45 £ psel(i) £ 64 

65 <; psel(i) s 128 


222 

116 (29 x four kinds of 
fractional lags) 

42 (21 x two kinds of 
fractional lags) 

64 ( 64 x one kind of 
fractional lag) 


Number of fixed code 
vectors 


32(l6x two kinds of codes) 



Adaptive code vectors to a fractional precision 
are generated through an interpolation which 
convolutes the coefficients of the polyphase filter 
stored in the polyphase coefficients storage section 
1309 . 



Interpolation corresponding to the value of 
lagf(i) means interpolation corresponding to an 
integer lag position when lagf(i) = 0, interpolation 
corresponding to a .fractional lag position shifted by 
-1/2 from an integer lag position when lagf(i) = 1. 
interpolation corresponding to a fractional lag 
position shifted by +1/4 from an integer lag position 
when lagf(i) = 2, and interpolation corresponding to 
a fractional lag position shifted by -1/4 from an 
integer lag position when lagf(i) = 3. 

The adaptive/fixed selector 1320 first receives 
adaptive code vectors of the Nac (6 to 24) candidates 
generated by the adaptive code vector generator 1319 
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and sends the vectors to the perceptual weighted LPC 
synthesis filter A 1321 and the comparator A 1322. 

To pre-select the adaptive code vectors Pacb(i.k) 
(0 ^ i ^ Nac-1, 0 ^ k ^ Ns-1, 6 ^ Nac ^ 24) 
generated by the adaptive code vector generator 1319 
to Nacb (= 4) candidates from Nac (6 to 24) 
candidates, the comparator A 1322 first acquires the 
inner products prac(i) of the time reverse 
synthesized vectors rh(k) (0 ^ i ^ Ns-1) of the 
target vector, received from the perceptual weighted 
LPC reverse synthesis filter A 1317, and the adaptive 
code vectors Pacb(i.k) from an equation 13. 
Ns-1 

prac(i)= 2 Pacb ^' k)xrh(k) (13) 
k=0 

where Prac(i): reference value for pre-selection of 
adaptive code vectors 

Nac: the number of adaptive code vector 
candidates after pre-selection (= 6 to 24) 

i: number of an adaptive code vector (0 ^ i ^ 
Nac-1) 

Pacb(i,k): adaptive code vector 
rh(k): time reverse synthesis of the target 
vector r (k) . 

By comparing the obtained inner products Prac(i), 
the top Nacp (= 4) indices when the values of the 
products become large and inner products with the 
indices used as arguments are selected and are 
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respectively saved as indices of adaptive code 
vectors after pre-selection apsel(j) (0 ^ j ^ Nacb- 
1) and reference values after pre-selection of 
•adaptive code vectors prac ( apsel ( j ) ) , and the indices 
of adaptive code vectors after pre-selection apsel(j) 
(0 ^ j ^ Nacb-1) are output to the adaptive /fixed 
selector 1320 . 

The perceptual weighted LPC synthesis filter A 
1321 performs perceptual weighted LPC synthesis on 
adaptive code vectors after pre-selection 
Pacb(absel( j ) ,k) , which have been generated by the 
adaptive code vector generator 1319 and have passed 
the adaptive/fixed selector 1320, to generate 
synthesized adaptive code vectors SYNacb ( apsel ( j ), k ) 
which are in turn sent to the comparator A 1322. 
Then, the comparator A 1322 acquires reference values 
for final-selection of an adaptive code vector 
sacbr(j) from an equation 14 for final-selection on 
the Nacb (= 4) adaptive code vectors after pre- 
selection Pacb(absel( j) ,k) , pre-selected by the 
comparator A 1322 itself. 

sacbr(j)- P™ 2 <»P"l(i)) (14) 

y NS SYNacb 2 (j,k) 

where sacbr(j): reference value for final-selection 
of an adaptive code vector 

prac(): reference values after pre-selection of 
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adaptive code vectors 

apsel(j): indices of adaptive code vectors after 
pre- selection 

k: vector order (0 ^ j ^ Ns-1) 

j: number of the index of a pre-selected adaptive 

code vector (0 ^ j ^ Nacb-1) 

Ns: subframe length (= 52) 

Nacb: the number of pre-selected adaptive code 

vectors ( = 4) 

SYNacb(J,K): synthesized adaptive code vectors. 

The index when the value of the equation 14 
becomes large and the value of the equation 14 with 
the index used as an argument are sent to the 
adaptive/fixed selector 1320 respectively as an index 
of adaptive code vector after final- selection ASEL 
and a reference value after final- selection of an 
adaptive code vector sacbr(ASEL) . 

A fixed codebook 1323 holds Nfc (= 16) candidates 
of vectors to be read by a fixed code vector reading 
section 1324. To pre-select fixed code vectors 
Pfcb(i,k) (0 ^ i ^ Nfc-1, 0 ^ k ^ Ns-1) read by the 
fixed code vector reading section 1324 to Nfcb (= 2) 
candidates from Nfc (= 16) candidates, the comparator 
A 1322 acquires the absolute values lprfc(i)l of the 
inner products of the time reverse synthesized 
vectors rh(k) (0 ^ i ^ Ns-1) of the target vector, 
received from the perceptual weighted LPC reverse 
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synthesis filter A 1317, and the fixed code vectors 
Pfcb(i.k) from an equation 15. 
Ns-1 

|prfc(i)|= J Pfcb(i ' k)xrh(k) (15) 
k-0 

where lprfc(i)l: reference values for pre-selection 

of fixed code vectors 

k: element number of a vector (0 ^ k ^ Ns-1) 
i: number of a fixed code vector (0 ^ i ^ Nfc-1) 
. Nfc: the number of fixed code vectors (= 16) 
Pfcb(i.k): fixed code vectors 

rh(k): time reverse synthesized vectors of the 
target vector rh(k) . 

By comparing the values lprfc(i)l of the equation 
15, the top Nfcb (= 2) indices when the values become 
large and the absolute values of inner products with 
the indices used as arguments are selected and are 
respectively saved as indices of fixed code vectors 
after pre-selection fpsel(j) ( 0 ^ j ^ Nfcb-1) and 
reference values for fixed code vectors after pre- 
selection I prfc ( fpsel ( j ) I , and indices of fixed code 
vectors after pre-selection fpsel(j) (0 ^ j ^ Nfcb- 

1) are output to the adaptive/fixed selector 1320. 
The perceptual weighted LPC synthesis filter A 

1321 performs perceptual weighted LPC synthesis on 

fixed code vectors after pre-selection 

Pfcb(fpsel( j) ,k) which have been read from the fixed 

code vector reading section 1324 and have passed the 
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adaptive/fixed selector 1320, to generate synthesized 
fixed code vectors SYNf cb ( f psel ( j ) , k ) which are in 
turn sent to the comparator A 1322. 

The comparator A 1322 further acquires a 
reference value for final-selection of a fixed code 
vector sfcbr(j) from an equation 16 to finally select 
an optimal fixed code vector from the Nfcb (= 2) 
fixed code vectors after pre-selection 
Pf cb(fpsel( j ) ,k) , pre-selected by the comparator A 
1322 itself. 

sfcbr(i)- \f^\ 2 (16) 
SYNfcb 2 (j,k) 

where sfcbr(j): reference value for final-selection 
of a fixed code vector 

lprfc()l: reference values after pre-selection of 
fixed code vectors 

fpsei(j): indices of fixed code vectors after 
pre-selection (0 ^ j ^ Nfcb-1) 

k: element number of a vector (0 ^ k ^ Ns-1) 

j: number of a pre-selected fixed code vector (0 

^ j ^ Nfcb-1) 

Ns : subframe length (= 52) 

Nfcb: the number of pre-selected fixed code 

vectors (= 2) 

SYNf cb( J,K) : synthesized fixed code vectors. 
The index when the value of the equation 16 
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becomes large and the value of the equation 16 with 
the index used as an argument are sent to the 
adaptive/fixed selector 1320 respectively as an index 
of fixed code vector after final- selection FSEL and a 
reference value after final-selection of a fixed code 
vector sacbr(FSEL). 

The adaptive/fixed selector 1320 selects either 
the adaptive code vector after final- selection or the 
fixed code vector after final- selection as an 
adaptive/fixed code vector AF(k) (0 ^ k ^ Ns-1) in 
accordance with the size relation and the polarity 
relation among prac(ASEL) , sacbr(ASEL), Iprf c(FSEL ) I 
and sfcbr(FSEL) (described in an equation 17) 
received from the comparator A 1322. 

fPacb(ASEL, k) sacbr(ASEL) * sfcbr(FSEL) , prac(ASEL) > 0 
0 sacbr(ASEL) a sfcbr(FSEL) , prac(ASEL) s 0 

Pfcb(FSEL, k) sacbr(ASEL) < sfcbr (FSEL) , prfc(FSEL) a 0 
- Pfcb(FSEL, k) sacbr(ASEL) < sfcbr(FSEL), prfc(FSEL) < 0 

(17) 

where AF(k): adaptive/ fixed code vector 

ASEL: index of adaptive code vector after final- 
selection 

FSEL: index of fixed code vector after final- 
selection 

k: element number of a vector 

Pacb ( ASEL , k) : adaptive code vector after final- 
selection 

Pf cb (FSEL , k ) : fixed code vector after final- 



AF(k) = 
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selection Pfcb(FSEL,k) 

sacbr(ASEL): reference value after final- 
selection of an adaptive code vector 

sfcbr(FSEL): reference value after final- 
5 selection of a fixed code vector 

prac(ASEL): reference values after pre- selection 
of adaptive code vectors 

prfc(FSEL): reference values after pre- selection 
of fixed code vectors prfc(FSEL). 
10 The selected adaptive/fixed code vector AF(k) is 

sent to the perceptual weighted LPC synthesis filter 
A 1321 and an index representing the number that has 
generated the selected adaptive/fixed code vector 
AF(k) is sent as an adaptive/fixed index AFSEL to the 
15 parameter coding section 1331. As the total number 
of adaptive code vectors and fixed code vectors is 
designed to be 255 (see Table 6), the adaptive /fixed 
index AFSEL is a code of 8 bits. 

The perceptual weighted LPC synthesis filter A 
20 1321 performs perceptual weighted LPC synthesis on 

the adaptive/fixed code vector AF ( k ) , selected by the 
adaptive/fixed selector 1320. to generate a 
synthesized adaptive/fixed code vector . SYNaf (k) (0 ^ 
k ^ Ns-1) and sends it to the comparator A 1322. 
25 The comparator A 1322 first obtains the power 

powp of the synthesized adaptive/fixed code vector 
SYNaf(k) (0 S k ^ Ns-1) received from the perceptual 
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weighted LPC synthesis filter A 1321 using an 
equation 18. 
Ns-1 

powp = V SYNar(k) (18) 
k=0 

where powm: power of adaptive/fixed code vector 
(SYNaf(k)) 

k: element number of a vector (0 ^ k ^ Ns-1) 

Ns: subframe length (=52) 

SYNaf(k): adaptive/fixed code vector. 

Then, the inner product pr of the target vector 
received from the target vector generator A 1316 and 
the synthesized adaptive/fixed code vector SYNaf(k) 
is acquired from an equation 19. 
Ns-1 

pr= 5) SYNaf(k)xr(k) (19) 
k=0 

where pr : inner product of SYNaf(k) and r(k) 
Ns: subframe length (= 52) 
SYNaf(k): adaptive/fixed code vector 
r (k) : target vector 

k: element number of a vector ( 0 =a k =s Ns-1). 

Further, the adaptive/fixed code vector AF(k) 
received from the adaptive/fixed selector 1320 is 
sent to an adaptive codebook updating section 1333 t 
compute the power POWaf of AF(k), the synthesized 
adaptive/fixed code vector SYNaf(k) and POWaf are 
sent to the parameter coding section 1331, and powp, 
pr, r(k) and rh(k) are sent to a comparator B 1330. 
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The target vector generator B 1325 subtracts the 
synthesized adaptive/ fixed code vector SYNaf(k), received 
from the comparator A 1322, from the target vector r(i) (0 
^5 i ^ Ns-1) received from the comparator A 1322, to 
5 generate a new target vector, and sends the new target 
vector to the perceptual weighted LPC reverse synthesis 
filter B 1326. 

The perceptual weighted LPC reverse synthesis filter B 
1326 sorts the new target vectors, generated by the target 
10 vector generator B 1325, in a time reverse order, sends the 
sorted vectors to the perceptual weighted LPC synthesis 
filter in a zero state, the output vectors are sorted again 
in a time reverse order to generate time-reversed 
synthesized vectors ph(Jc) (0 ^ k ^ Ns-1) which are in turn 
15 sent to the comparator B 1330. 

An excitation vector generator 1337 in use is the same 
as, for example, the excitation vector generator 70 which 
has been described in the section of the third mode. The 
excitation vector generator 70 generates a random code 
20 vector as the first seed is read from the seed storage 

section 71 and input to the non-linear digital filter 72. 
The random code vector generated by the excitation vector 
generator 70 is sent to the perceptual weighted LPC 
synthesis filter B 1329 and the comparator B 1330. Then, as 
25 the second seed is read from the seed storage section 71 
and input to the non-linear digital filter 72, a random 
code vector is generated and output to the filter B 1329 
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and the comparator B 1330. 

To pre -select random code vectors generated based on 
the first seed to Nstb (= 6) candidates from Nst (= 64) 
candidates, the comparator B 1330 acquires reference values 
5 cr(il) (0 ^ il ^ Nstbl-1) for pre-selection of first random 
code vectors from an equation 20. 

Ns-1 nr Ns-1 

cr(il)- V Pstbl(ilj)xrh(j) — V Pstbl(ilj) x ph(j) (20) 
jt6 powp i=o 

where cr(il): reference values for pre-selection of first 
random code vectors 
10 Ns: subframe length (= 52) 

rh(j): time reverse synthesized vector of a target 

vector (r(j) ) 

powp: power of an adaptive /fixed vector (SYNaf(k)) 
pr: inner product of SYNaf (k) and r(k) 
15 Pstbl(il,j): first random code vector 

ph(j): time reverse synthesized vector of SYNaf (k) 
il: number of the first random code vector (0 ^ il ^ 
Nst-l) 

j: element number of a vector. 

20 By comparing the obtained values cr(il), the top Nstb 

(=6) indices when the values become large and inner 
products with the indices used as arguments are selected 
and are respectively saved as indices of first random code 
vectors after pre-selection slpsel(jl) (0 ^ jl ^ Nstb-1) 

25 and first random code vectors after pre-selection 

Pstbl(slpsel(jl),k) (0 ^ jl ^ Nstb-1. 0 ^ k ^ Ns-1). Then, 
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the same process as done for the first random code vectors 
is performed for second random code vectors and indices and 
inner products are respectively saved as indices of second 
random code vectors after pre-selection slpsel(j2) (0 ^ j2 
5 ^ Nstb-1) and second random code vectors after pre- 
selection Pstb2(s2psel( j2) ,k) (0 ^ j2 ^ Nstb-1, 0 ^ k ^ 
Ns-1). 

The perceptual weighted LPC synthesis filter B 1329 
performs perceptual weighted LPC synthesis on the first 
_ 10 random code vectors after pre-selection Pstbl(slpsel( jl) ,k) 

y3 to generate synthesized first random code vectors 

3 fj 5 

=P SYNstbl(slpsel( jl) ,k) which are in turn sent to the 

ill 

%p comparator B 1330. Then, perceptual weighted LPC synthesis 

yg is performed on the second random code vectors after pre- 

p 15 selection Pstb2 ( slpsel( j2 ) ,k) to generate synthesized 

Tt second random code vectors SYNstb2( s2psel( j2 ) ,k) which are 

S in turn sent to the comparator B 1330. 

^ To implement final- selection on the first random code 

vectors after pre-selection Pstbl (slpsel( jl ) ,k) and the 

20 second random code vectors after pre-selection 

Pstb2(slpsel( j2) ,k) , pre-selected by the comparator B 1330 
itself, the comparator B 1330 carries out the computation 
of an equation 21 on the synthesized first random code 
vectors SYNstbK slpseK jl) ,k) computed in the perceptual 

25 weighted LPC synthesis filter B 1329. 



SYNOstbl(slpsel(jl),k) - SYNstbl(slpsel(jl),k) 

- STOaf W J Ns -Vstbl(slpsel(il).k)>ph(k) 1211 

powp ^ k =* 0 

where SYNOstbK slpsel( jl) ,k) : orthogonally synthesized 
first random code vector 

SYNstbl(slpsel( jl) ,k) : synthesized first random code 
vector 

Pstbl(slpseM.jl) ,k) : first random code vector after 
pre-selection 

SYNaf(j): adaptive /fixed code vector 

powp: power of adaptive/fixed code vector (SYNaf ( j ) ) 

Ns: subframe length (=52) 

ph(k) : time reverse synthesized vector of SYNaf (j) 
jl: number of first random code vector after pre- 
selection 

k: element number of a vector (0 ^ k ^ Ns-1). 

Orthogonally synthesized first random code vectors 
SYNOstbl(slpsel( jl) ,k) are obtained, and a similar 
computation is performed on the synthesized second random 
code vectors SYNstb2(s2psel( j2) ,k) to acquire orthogonally 
synthesized second random code vectors 

SYNOstb2(s2psel( j2) ,k) , and reference values after final- 
selection of a first random code vector slcr and reference 
values after final- selection of a second random code vector 
s2cr are computed in a closed loop respectively using 
equations 22 and 23 for all the combinations (36 
combinations) of (slpsel(jl). s2psel(j2)). 



csc rl 



scrl = 




, k) + SYNOstb2(s2psel(j2),k)P 



k = 0 



(22) 



where scrl: reference value after final-selection of a 
first random code vector 

cscrl: constant previously computed from an equation 

24 

SYNOstbl(slpsel( jl) ,k) : orthogonally synthesized first 
random code vectors 

SYNOstb2(s2psel( J2) ,k) : orthogonally synthesized 
second random code vectors 

r(k): target vector 

slpsel(jl): index of first random code vector after 
pre-selection 

s2psel(j2): index of second random code vector after 
pre-selection 

Ns: subframe length (=52) 

k: element number of a vector. 



where scr2: reference value after final- selection of a 
second random code vector 

cscr2: constant previously computed from an equation 



scr2 = 




(23) 
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SYNOstbl(slpseMjl) ,k) : orthogonally synthesized first 

random code vectors 

SYNOstb2(s2psel( j2) ,k) : orthogonally synthesized 
second random code vectors 
5 r(k): target vector 

slpsel(jl): index of first random code vector after 
pre-selection 

s2psel(j2): index of second random code vector after 
pre-selection 
10 Ns: subframe length (= 52) 

k: element number of a vector. 

Note that cslcr in the equation 22 and cs2cr in the 

equation 23 are constants which have been calculated 

previously using the equations 24 and 25, respectively. 

Ns-1 Ns-1 
15 cscrl= y SYNOstbl(slpsel(jl) ) k)xr(k)^ V SYNOstb2(s2psel(j2),k)xr(k) 

(24) 

where cscrl: constant for an equation 29 

SYNOstbl(slpsel( jl) ,k) : orthogonally synthesized first 
random code vectors 
20 SYNOstb2(s2psel( j2) ,k) : orthogonally synthesized 

second random code vectors 
r(k): target vector 

slpsel(jl): index of first random code vector after 
pre-selection 

25 s2psel(j2): index of second random code vector after 

pre-selection 
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Ns: subframe length ( = 52) 

k: element number of a vector. 

Ns-l Ns-l 
cscrl= J SYNOstbl(slpsel(jl),k)xr(k)- YSYNOstb2(s2psel(j2),k)xr(k) 

(25) 

5 where cscr2 : constant for the equation 23 

SYNOstbl(slpsel(jl) ,k) : orthogonally synthesized first 

random code vectors 

SYNOstb2(s2psel( j2) ,k) : orthogonally synthesized 

second random code vectors 
10 r(k): target vector 

slpsel(jl): index of first random code vector after 
pre-selection 

s2psel(j2): index of second random code vector after 
pre-selection 
15 Ns: subframe length (=52) 

k: element number of a vector. 

The comparator B 1330 substitutes the maximum value of 
Slcr in MAXslcr, substitutes the maximum value of S2cr in 
MAXs2cr, sets MAXslcr or MAXs2cr, whichever is larger, as 

20 scr, and sends the value of slpsel(jl), which had been 

referred to when scr was obtained, to the parameter coding 
section 1331 as an index of a first random code vector 
after final-selection SSEL1. The random code vector that 
corresponds to SSEL1 is saved as a first random code vector 

25 after final-selection Pstbl(SSELl,k) , and is sent to the 
parameter coding section 1331 to acquire a first random 
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code vector after final- selection SYNstbl (SSEL1 ,k) (0 ^ k 
^ Ns-1) corresponding to Pstbl (SSEL1 ,k) . 

Likewise, the value of s2psel(j2), which had been 
referred to when scr was obtained, to the parameter coding 
section 1331 as an index of a second random code vector 
after final-selection SSEL2. The random code vector that 
corresponds to SSEL2 is saved as a second random code 
vector after final-selection Pstb2 ( SSEL2 ,k) , and is sent to 
the parameter coding section 1331 to acquire a second 
random code vector after final- selection SYNstb2(SSEL2,k) 
(0 ^ k ^ Ns-1) corresponding to Pstb2(SSEL2 , k) . 

The comparator B 1330 further acquires codes SI and S2 
by which Pstbl(SSELl,k) and Pstb2(SSEL2,k) are respectively 
multiplied, from an equation 26, and sends polarity 
information Isls2 of the obtained SI and S2 to the 
parameter coding section 1331 as a gain polarity index 
Isls2 (2 -bit information). 



K 9 J | (+1,-1) scrl < scr2, cscr2 * 0 
(-1,+1) scrl < scr2, cscr2 < 0 
where SI: code of the first random code vector after final- 
selection 

S2: code of the second random code vector after final- 
selection 

scrl:. output of the equation 29 
scr2: output of the equation 23 
cscrl: output of the equation 24 



X+1.+D 

(-1-1) 



scrl a scr2, cscrl s= 0 
scrl a scr2, cscrl < 0 



(26) 



cscr2: output of the equation 25. 

A random code vector ST(k) (0 ^ k ^ Ns-1) is generated 
by an equation 27 and output to the adaptive codebook 
updating section 1333, and its power POWsf is acquired and 
output to the parameter coding section 1331. 
ST(k) « Six Pstbl(SSELl, k) + S2 x Pstb2(SSEL2, k) (27). 
where ST(k) : probable code vector 

SI: code of the first random code vector after final- 
selection 

S2: code of the second random code vector after final- 
selection 

Pstbl(SSELl,k) : first-stage settled code vector after 
final- select ion 

Pstbl(SSEL2,k) : second-stage settled code vector after 

final- selection 

SSEL1: index of the first random code vector after 

final- selection 

SSEL2: second random code vector after final-selection 

k: element number of a vector (0 ^ k ^ Ns-1). 

A synthesized random code vector SYNst(k) (0 ^ k ^ Ns- 

1) is generated by an equation 28 and output to the 

parameter coding section 1331. 

SYNst(k) - SI x SYNstbKSSELl, k) + S2 x SYNstb2(SSEL2, k) (28) 
where STNst(k): synthesized probable code vector 

SI: code of the first random code vector after final- 
selection 

S2: code of the second random code vector after final- 



selection 

SYNstbl(SSELl,k) : synthesized first random code vector 
after final-selection 

SYNstb2(SSEL2,k) : synthesized second random code 
vector after final-selection 

k: element number of a vector (0 2* k ^ Ns-1). 

The parameter coding section 1331 first acquires a 
residual power estimation for each subfraxne rs is acquired 
from an equation 29 using the decoded frame power spow 
which has been obtained by the frame power 
quantizing/decoding section 1302 and the normalized 
predictive residual power resid, which has been obtained by 
the pitch pre-selector 1308. 

rs=NsXspowX resid (29) 
where rs : residual power estimation for each subframe 

Ns: subframe length (= 52) 
spow: decoded frame power 

resid: normalized predictive residual power. 

A reference value for quantization gain selection STDg 
is acquired from an equation 30 by using the acquired 
residual power estimation for each subframe rs, the power 
of the adaptive/fixed code vector POWaf computed in the 
comparator A 1322, the power of the random code vector 
POWst computed in the comparator B 1330, a gain 
quantization table (CGaf [i] ,CGst [i] ) (0 ^ i ^ 127) of 256 
words stored in a gain quantization table -storage section 
1332 and the like. 
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Table 7: Gain quantization table 



i 


CGaf(i) 


CGst(i) 


1 


0.38590 


0.23477 


2 


0.42380 


0.50453 


3 


0.23416 


0.24761 








12 6 


0.35382 


1.68987 


12 7 


0.10689 


1.02035 


12 8 


3.09711 


1.75430 



rs 



STDg 



POWaf 



•CGaf(Ig)*SYNaf(k) 



rs 



! POWst 



-CGst{Ig)xSYNst(k)-r(k) 



(30) 

5 where STDg: reference value for quantization gain selection 
rs: residual power estimation for each subf rame . 
POWaf: power of the adaptive /fixed code vector 
POWSst : power of the random code vector 
i: index of the gain quantization table (0 ^ i ^ 127) 
10 CGaf(i): component on the adaptive /fixed code vector 

side in the gain quantization table 

CGst(i): component on the random code vector side in 
the gain quantization table 

SYNaf(k): synthesized adaptive /fixed code vector 
15 SYNst(k): synthesized random code vector 

r(k) : target vector 
Ns: subf rame length (= 52) 

k: element number of a vector (0 ^ k ^ Ns-1). 



One index when the acquired reference value for 
quantization gain selection STDg becomes minimum is 
selected as a gain quantization index Ig, a final gain on 
the adaptive /fixed code vector side Gaf to be actually 
applied to AF(k) and a final gain on the. random code vector 
side Gst to be actually applied to ST(k) are obtained from 
an equation 31 using a gain after selection of the 
adaptive /fixed code vector CGaf(Ig), which is read from the 
gain quantization table based on the selected gain 
quantization index Ig, a gain after selection of the random 
code vector CGst(Ig), which is read from the gain 
quantization table based on the selected gain quantization 
index Ig and so forth, and are sent to the adaptive 
codebook updating section 1333. 

(31) 

where Gaf: final gain on the adaptive /fixed code vector 
side 

Gst: final gain on the random code vector side Gst 
rs: residual power estimation for each subframe 
POWaf : power of the adaptive/fixed code vector 
POWst: power of the random code vector 
CGaf(Ig): power of a fixed/adaptive side code vector 
CGst(Ig): gain after selection of a random code vector 

side 

Ig: gain quantization index. 
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The parameter, coding section 1331 converts the index 
of power Ipow, acquired by the frame power 
quantizing/decoding section 1302, the LSP code lisp, 
acquired by the LSP quantizing/ decoding section 1306, the 
5 adaptive/fixed index AFSEL , acquired by the adaptive/fixed 
selector 1320. the index of the first random code vector 
after final-selection SSEL1, the second random code vector 
after final -select ion SSEL2 and the polarity information 
Isls2, acquired by the comparator B 1330, and the gain 

10 quantization index Ig, acquired by the parameter coding 

section 1331, into a speech code, which is in turn sent to 
a transmitter 1334. 

The adaptive codebook updating section 1333 performs a 
process of an equation 32 for multiplying the 

15 adaptive/fixed code vector AF(k), acquired by the 

comparator A 1322, and the random code vector ST(k), 
acquired by the comparator B 1330, respectively by the 
final gain on the adaptive /fixed code vector side Gaf and 
the final gain on the random code vector side Gst, acquired 

20 by the parameter coding section 1331, and then adding the 

results to thereby generate an excitation vector ex(k) (0 ^ 

k ^ Ns-1), and sends the generated excitation vector ex(k) 

(0 ^ k ^ Ns-1) to the adaptive codebook 1318. 

ex(k) = Gaf x AF(k) + Gst x ST(k) 

(32) 

25. where ex(k) : excitation vector 

AF(k): adaptive /fixed code vector 
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ST(k): random code vector 

k: element number of a- vector (0 ^ k ^ Ns-1). 
At this time, an old excitation vector in the adaptive 
codebook 1318 is discarded and is updated with a new 
5 excitation vector ex(k) received from the adaptive codebook 
updating section 1333. 
(Eighth Mode) 

A description will now be given of an eighth mode in 
which any excitation vector generator described in first to 
10 sixth modes is used in a speech decoder that is based on 
the PSI-CELP, the standard speech coding/decoding system 
for PDC digital portable telephones. This decoder makes a 
pair with the above -described seventh mode. 

FIG. 14 presents a functional block diagram of a 
15 speech decoder according to the eighth mode. A parameter 

decoding section 1402 obtains the speech code (the index of 
power Ipow, LSP code lisp, adaptive /fixed index AFSEL , 
index of the first random code vector after final-selection 
SSEL1, second random code vector after final- selection 
20 SSEL2 , gain quantization index Ig and gain polarity index 

Isls2), sent from the CELP type speech coder illustrated in 
FIG. 13, via a transmitter 1401. 

Next, a scalar value indicated by the index of power 
Ipow is read from the power quantization table (see Table 
25 3) stored in a power quantization table storage section 
1405. is sent as decoded frame power spow to a power 
restoring section 1417, and a vector indicated by the LSP 
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code lisp is read from the LSP quantization table an LSP 
quantization table storage section 1404 and is sent as a 
decoded LSP to an LSP interpolation section 1406. The 
adaptive /fixed index AFSEL is sent to an adaptive code 
5 vector generator 1408, a fixed code vector reading section 
1411 and an adaptive/fixed selector 1412, and the index of 
the first random code vector after final- selection SSEL1 
and the second random code vector after final-selection 
SSEL2 are output to an excitation vector generator 1414. 
10 The vector '(CAaf(Ig), CGst(Ig)) indicated by the gain 

quantization index Ig is read from the gain quantization 
table (see Table 7) stored in a gain quantization table 
storage section 1403, the final gain on the final gain on 
the adaptive /fixed code vector side Gaf to be actually 
15 applied to AF(Jc) and the final gain on the random code 
vector side Gst to be actually applied to ST(k) are 
acquired from the equation 31 as done on the coder side, 
and the acquired final gain on the adaptive/ fixed code 
vector side Gaf and final gain on the random code vector 
20 side Gst are output together with the gain polarity index 
Isls2 to an excitation vector generator 1413. 

The LSP interpolation section 1406 obtains a decoded 
interpolated LSP ft)intp(n.i) (1 ^ i ^ Np) subframe by 
subframe from the decoded LSP received from the parameter 
25 decoding section 1402, converts the obtained co intp(n.i) to 
an LPC to acquire a decoded interpolated LPC, and sends the 
decoded interpolated LPC to an LPC synthesis filter 1416. 



The adaptive code vector generator 1408 convolute some 
of polyphase coefficients stored in a polyphase 
coefficients storage section 1409 (see Table 5) on vectors 
read from an adaptive codebook 1407, based on the 
adaptive /fixed index AFSEL received from the parameter 
decoding section 1402, thereby generating adaptive code 
vectors to a fractional precision, and sends the adaptive 
code vectors to the adaptive/fixed selector 1412. The fixed 
code vector reading section 1411 reads fixed code vectors 
from a fixed codebook 1410 based on the adaptive /fixed 
index AFSEL received from the parameter decoding section 
1402, and sends them to the adaptive /fixed selector 1412. 

The adaptive/fixed selector 1412 selects either the 
adaptive code vector input from the adaptive code vector 
generator 1408 or the fixed code vector input from the 
fixed code vector reading section 1411, as the 
adaptive/fixed code vector AF(k), based on the 
adaptive/fixed index AFSEL received from the parameter 
decoding section 1402, and sends the selected 
adaptive/fixed code vector AF(k) to the excitation vector 
generator 1413. The excitation vector generator 1414 
acquires the first seed and second seed from the seed 
storage section 71 based on the index of the first random 
code vector after final-selection SSEL1 and the second 
random code vector after final- selection SSEL2 received 
from the parameter decoding section 1402, and sends the 
seeds to the non-linear digital filter 72 to generate the 




first random code vector and the second random code vector, 
respectively. Those reproduced first random code vector and 
second random code vector are respectively multiplied by 
the first-stage information SI and second-stage information 
S2 of the gain polarity index to generate an excitation 
vector ST(k) , which is sent to the excitation vector 
generator 1413. 

The excitation vector generator 1413 multiplies the 
adaptive/fixed code vector AF(k), received from the 
adaptive/fixed selector 1412, and the excitation vector 
ST(k), received from the excitation vector generator 1414, 
respectively by the final gain on the adaptive/fixed code 
vector side Gaf and the final gain on the random code 
vector side Gst ,. obtained by the parameter decoding section 
1402, performs addition or subtraction based on the gain 
polarity index Isls2, yielding the excitation vector ex(k) , 
and sends the obtained excitation vector to the excitation 
vector generator 1413 and the adaptive codebook 1407. Here, 
an old excitation vector in the adaptive codebook 1407 is 
updated with a new excitation vector input from the 
excitation vector generator 1413. 

The LPC synthesis filter 1416 performs LPC synthesis 
on the excitation vector, generated by the excitation 
vector generator 1413, using the synthesis filter which is 
constituted by the decoded interpolated LPC received from 
the LSP interpolation section 1406, and sends the filter 
output to the power restoring section 1417. The power 
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restoring section 1417 first obtains the mean power of the 
synthesized vector of the excitation vector obtained by the 
LPC synthesis filter 1416, then divides the decoded frame 
power spow, received from the parameter decoding section 
1402, by the acquired mean power, and multiplies the 
synthesized vector of the excitation vector by the division 
result to generate a synthesized speech 518. 
(Ninth Mode) 

FIG. 15 is a block diagram of the essential portions 
of a speech coder according to a ninth mode. This speech 
coder has a quantization target LSP adding section 151, an 
LSP quantizing/decoding section 152, a LSP quantization 
error comparator 153 added to the speech coder shown in 
FIGS. 13 or parts of its functions modified. 

The LPC analyzing section 1304 acquires an LPC by 
performing linear predictive analysis on a processing frame 
in the buffer 1301, converts the acquired LPC to produce a 
quantization target LSP, and sends the produced 
quantization target LSP to the quantization target LSP 
adding section 151. The LPC analyzing section 1304 also hai 
a particular function of performing linear predictive 
analysis on a pre-read area to acquire an LPC for the pre- 
read area, converting the obtained LPC to an LSP for the 
pre-read area, and sending the LSP to the quantization 
target LSP adding section 151. 

The quantization target LSP adding section 151 
produces a plurality of quantization target LSPs in 



addition to the quantization target LSPs directly obtained 
by converting LPCs in a processing frame in the LPC 
analyzing section 1304. 

The LSP quantization table storage section 1307 stores 
the quantization table which is referred to by the LSP 
quantizing/decoding section 152, and the LSP 
quantizing/decoding section 152 quantizes /decodes the 
produced plurality of quantization target LSPs to generate 
decoded LSPs . 

The LSP quantization error comparator 153 compares the 
produced decoded LSPs with one another to select, in a 
closed loop, one decoded LSP which minimizes an allophone, 
and newly uses the selected decoded LSP as a decoded LSP 
for the processing frame. 

FIG. 16 presents a block diagram of the quantization 
target LSP adding section 151. 

The quantization target LSP adding section 151 
comprises a current frame LSP memory 161 for storing the 
quantization target LSP of the processing frame obtained by 
the LPC analyzing section 1304. a pre-read area LSP memory 

162 for storing the LSP of the pre-read area obtained by 
the LPC analyzing section 1304, a previous frame LSP memory 

163 for storing the decoded LSP of the previous processing 
frame, and a linear interpolation section 164 which 
performs linear interpolation on the LSPs read from those 
three memories to add a plurality of quantization target 
LSPs. 
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A plurality of quantization target LSPs are 
additionally produced by performing linear interpolation on 
the quantization target LSP of the processing frame and the 
LSP of the pre-read, and produced quantization target LSPs 
are all sent to the LSP quantizing/decoding section 152. 

The quantization target LSP adding section 151 will 
now be explained more specifically. The LPC analyzing 
section 1304 performs linear predictive analysis on the 
processing frame in the buffer to acquire an LPC a (i) (1 ^ 
i ^ Np) of a prediction order Np (= 10), converts the 
obtained LPC to generate a quantization target LSP co(i) (1 
^ i ^ Np), and stores the generated quantization target 
LSP oo (i) (1 ^ i ^ Np) in the current frame LSP memory 161 
. in the quantization target LSP adding section 151. Further, 
15 the LPC analyzing section 1304 performs linear predictive 
analysis on the pre-read area in the buffer to acquire an 
LPC for the pre-read area, converts the obtained LPC to 
generate a quantization target LSP cof(i) (1 £ i £ Np). and 
stores the generated quantization target LSP cu(i) (1 ^ i ^ 
Np) for the pre-read area in the pre-read area LSP memory 
162 in the quantization target LSP adding section 151. 

Next, the linear interpolation section 164 reads the 
quantization target LSP w(i) (1 ^ i ^ Np) for the 
processing frame from the current frame LSP memory 161, the 
LSP oof(i) (l^i^Np) for the pre-read area from the pre- 
read area LSP memory 162, and decoded LSP coqp(i) (l £ 1 £ 
Np) for the previous processing frame from the previous 
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frame LSP memory 163, and executes conversion shown by an 
equation 33 to respectively generate first additional 
quantization target LSP coi(i) (l i £ Np), second 
additional quantization target LSP co 2 (i) (1 £ i £ Hp), and 
third additional quantization target LSP coi(i) (1 ^ i ^ 
Np). 
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where coi(i): first additional quantization target LSP 
C02(i): second additional quantization target LSP 
C03(i): third additional quantization target LSP 
i: LPC order (1 £ 1 Np) 

Np: LPC analysis order (= 10) 

coq(i) ; decoded LSP for the processing frame 

cuqp(i) ; decoded LSP for the previous processing frame 

a)f(i): LSP for the pre-read area. 

The generated co 1(1), a>2(i) and <03(i) are sent to the 
LSP quantizing/decoding section 152. After performing 
vector quantization/decoding of all the four quantization 
target LSPs co(i)', coi(ij, C02(i) and co3(i), the LSP 
quantizing/ decoding section 152 acquires power Epow ( co ) of 
an quantization error for co(i), power Epow(a)l) of an 
quantization error for coi(i). power Epow (co 2) of an 
quantization error for co 2(1). and power Epow(co3) of an 
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quantization error for C03(i), carries out conversion of an 
equation 34 on the obtained quantization error powers to 
acquire reference values STDlsp (co). STDlsp (0)1), STDlsp (CO 
2) and STDlsp (CO 3) for selection of a decoded LSP. 



(34) 



where STDlsp( co ) : reference value for selection of a decoded 
LSP for co(i) 

STDlsp ( co 1) : reference value for selection of a decoded 
LSP for CO 1 ( i) 

STDlsp(co2): reference value for selection of a decoded 
LSP for co2(i) 

STDlsp (C03): reference value for selection of a decoded 

LSP for co3(i) 

Epow( co): quantization error power for co(i) 
Epow( col): quantization error power for co l(i) 
Epow(co2): quantization error power for co2(i) 
Epow(C03): quantization error power for co3(i). 
The acquired reference values for selection of a 
decoded LSP are compared with one another to select and 
output the decoded LSP for the quantization target LSP that 
becomes minimum as a decoded LSPC0q(i) (1 ^ i ^ Np) for the 
processing frame, and the decoded LSP is stored in the 
previous frame LSP memory 163 so that it can be referred to 
at the time of performing vector quantization of the LSP of 
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the next frame. 

According to this mode, by effectively using the high 
interpolation characteristic of an LSP (which does not 
cause an allophone even synthesis is implemented by using 
interpolated LSPs), vector quantization of LSPs can be so 
conducted as not to produce an allophone even for an area 
like the top of a word where the spectrum varies 
significantly. It is possible to reduce an allophone in a 
synthesized speech which may occur when the quantization 
characteristic of an LSP becomes insufficient. 

FIG. 17 presents a block diagram of the LSP 
quantizing/ decoding section 152 according to this mode. The 
LSP quantizing/decoding section 152 has a gain information 
storage section 171, an adaptive gain selector 172, a gain 
multiplier 173, an LSP quantizing section 174 and an LSP 
decoding section 175. 

The gain information storage section 171 stores a 
plurality of gain candidates to be referred to at the time 
the adaptive gain selector 172 selects the adaptive gain. 
The gain multiplier 173 multiplies a code vector, read from 
the LSP quantization table storage section 1307, by the 
adaptive gain selected by the adaptive gain selector 172. 
The LSP quantizing section 174 performs vector quantization 
of a quantization target LSP using the code vector 
multiplied by the adaptive gain. The LSP decoding section 
175 has a function of decoding a vector-quantized LSP to 
generate a decoded LSP and outputting it, and a function of 



acquiring an LSP quantization error, which is a difference 
between the quantization target LSP and the decoded LSP, 
and sending it to the adaptive gain selector 172. The 
adaptive gain selector 172 acquires the adaptive gain by 
which a code vector is multiplied at the time of vector- 
quantizing the quantization target LSP of the processing 
frame by adaptively adjusting the adaptive gain based on 
gain generation information stored in the gain information 
storage section 171, on the basis of, as references, the 
level of the adaptive gain by which a code vector is 
multiplied at the time the quantization target LSP of the 
previous processing frame was vector-quantized and the LSP 
quantization error for the previous frame, and sends the 
obtained adaptive gain to the gain multiplier 173. 

The LSP quantizing/decoding section 152 performs 
vector-quantizes and decodes a quantization target LSP 
while adaptively adjusting the adaptive gain by which a 
code vector is multiplied in the above manner. 

The LSP quantizing/decoding section 152 will now be 
discussed more specifically. The gain information storage 
section 171 is storing four gain candidates (0.9, 1.0, 1.1 
and 1.2) to which the adaptive gain selector 172 refers. 
The adaptive gain selector 172 acquires a reference value 
for selecting an adaptive gain, Slsp, from an equation 35 
for dividing power ERpow, generated at the time of 
quantizing the quantization target LSP of the previous 
frame, by the square, of an adaptive gain Gqlsp selected at 
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Shp = J" (35) 



the time of vector- quantizing the quantization target LSP 
of the previous processing frame. 

ERpow 
Gqlsp 

where Slsp: reference value for selecting an adaptive gain 

ERpow: quantization error power generated when 
quantizing the LSP of the previous frame 

Gqlsp: adaptive gain selected when vector-quantizing 
the LSP of the previous frame. 

One gain is selected from the four gain candidates 
(0.9, 1.0, 1.1 and 1.2), read from the gain information 
storage section 171, from an equation 36 using the acquired 
reference value Slsp for selecting the adaptive gain. Then, 
the value of the selected adaptive gain Gqlsp is sent to 
the gain multiplier 173, and information (2-bit 
15 information) for specifying type of the selected adaptive 
gain from the four types is sent to the parameter coding 
section. 



Glsp = a 



1.2 Slsp > 0.0025 

1.1 Slsp > 0.0015 

1.0 Slsp > 0.0008 

0.9 Slsp s 0.0008 



(36) 



where Glsp: adaptive gain by which a code vector for LS 
20 quantization is multiplied 

Slsp: reference value for selecting an adaptive gain. 
The selected adaptive gain Glsp and the error which 
has been produced in quantization are saved in the variable 
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Gqlsp and ERpow until the quantization target LSP of the 
next frame is subjected to vector quantization. 

The gain multiplier 173 multiplies a code vector, read 
from the LSP quantization table storage section 1307, by 
5 the adaptive gain selected by the adaptive gain selector 

172, and sends the result to the LSP quantizing section 174. 
The LSP quantizing section 174 performs vector quantization 
on the quantization target LSP by using the code vector 
multiplied by the adaptive gain, and sends its index to the 
10 parameter coding section. The LSP decoding section 175 
decodes the LSP, quantized by the LSP quantizing section 
174, acquiring a decoded LSP, outputs this decoded LSP, 
subtracts the obtained decoded LSP from the quantization 
target LSP to obtain an LSP quantization error, computes 

r 

15 the power ERpow of the obtained LSP quantization error, and 
sends the power to the adaptive gain selector 172. 

This mode can suppress an allophone in a synthesized 
speech which may be produced when the quantization 
characteristic of an LSP becomes insufficient. 

20 (Tenth Mode) 

FIG. 18 presents the structural blocks of an 
excitation vector generator according to this mode. This 
excitation vector generator has a fixed waveform storage 
section 181 for storing three fixed waveforms (vl (length: 

25 LI), v2 (length: L2) and v3 (length: L3)) of channels CHI, 
CH2 and CH3 , a fixed waveform arranging section 182 for 
arranging the fixed waveforms (vl, v2, v3), read from the 
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fixed waveform storage section 181, respectively at 
positions PI, P2 and P3, and an adding section 183 for 
adding the fixed waveforms arranged by the fixed waveform 
arranging section 182, generating an excitation vector, 
5 The operation of the thus constituted excitation 

vector generator will be discussed. 

Three fixed waveforms vl, v2 and v3 are stored in 
advance in the fixed waveform storage section 181. The 
fixed waveform arranging section 182 arranges (shifts) the 

10 fixed waveform vl, read from the fixed waveform storage 
section 181, at the position PI selected from start 
position candidates for CHI, based on start position 
candidate information for fixed waveforms it has as shown 
in Table 8, and likewise arranges the fixed waveforms v2 

15 and v3 at the respective positions P2 and P3 selected from 
start position candidates for CH2 and CH3 . 
table 8 



Channel 
number 


Sign 


start position candidate information 
for fixed waveform 


CHI 


±1 . 


P1(0, 1 0. 2 0, 3 0, 6 0, 7 0) 


CH2 


±1 


/ 2, 1 2, 2 2, 3 2, 6 2, 7 2 \ , 

P2 

\ 6, 1 6, 2 6, 3 6, 6 6, 7 6 ' 


CH3 


±1 


,4, 1 4, 2 4, 3 4, 6 4, 7 4 \ 

P3 

\ 8. 1 8, 2 8, 3 8, 6 8,7 8 ' 



The adding section 183 adds the fixed waveforms, 
arranged by the fixed waveform arranging section 182, to 
20 generate an excitation vector. 
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It is to be noted that code numbers corresponding, one 
to one, to combination information of selectable start 
position candidates of the individual fixed waveforms 
(information representing which positions were selected as 
PI, P2 and P3, respectively) should be assigned to the 
start position candidate information of the fixed waveforms 
the fixed waveform arranging section 182 has. 

According to the excitation vector generator with the 
above structure, excitation information can be transmitted 
by transmitting code numbers correlating to the start 
position candidate information of fixed waveforms the fixed 
waveform arranging section 182 has, and the code numbers 
exist by the number of products of the individual start 
position candidates, so that an excitation vector close to 
an actual speech can be generated. 

Since excitation information can be transmitted by 
transmitting code numbers, this excitation vector generator 
can be used as a random codebook in a speech coder/ decoder . 

While the description of this mode has been given with 
reference to a case of using three fixed waveforms as shown 
in FIG. 18, similar functions and advantages can be 
provided if the number of fixed waveforms (which coincides 
with the number of channels in FIG. 18 and Table 8) is 
changed to other values. 

Although the fixed waveform arranging section 182 in 
this mode has been described as having the start position 
candidate information of fixed waveforms given in Table 8, 




similar functions and advantages can be provided for other 
start position .candidate information of fixed waveforms 
than those in Table 8. 
(Eleventh Mode) 

5 FIG. 19A is a structural block diagram of a CELP type 

speech coder according to this mode, and FIG . 19B is a 
structural block diagram of a CELP type speech decoder 
' which is paired with the CELP type speech coder. 

The CELP type speech coder according to this mode has 

10 an excitation vector generator which comprises a fixed 

waveform storage section 18 1A, a fixed waveform arranging 
section 182A and an adding section 183A. The fixed waveform 
storage section 181A stores a plurality of fixed waveforms. 
The fixed waveform arranging section 182A arranges (shifts) 

15 fixed waveforms, read from the fixed waveform storage 

section 181A, respectively at the selected positions, based 
on start position candidate information for fixed waveforms 
it has. The adding section 183A adds the fixed waveforms, 
arranged by the fixed waveform arranging section 182A, to 

20 generate an excitation vector c. 

This CELP type speech coder has a time reversing 
section 191 for time-reversing a random codebook searching 
target x to be input, a synthesis filter 192 for 
synthesizing the output of the time reversing section 191, 

25 a time reversing section 193 for time-reversing the output 
of the synthesis filter 192. again to yield a time-reversed 
synthesized target x ! , a synthesis filter 194 for 



synthesizing the excitation vector c multiplied by a random 
code vector gain gc, yielding a synthesized excitation 
vector s , a distortion calculator 205 for receiving x' , c 
and s and computing distortion, and a transmitter 196. 

5 According to this mode, the fixed waveform storage 

section 181A, the fixed waveform arranging section 182A and 
the adding section 183A correspond to the fixed waveform 
storage section 181. the fixed waveform arranging section 
182 and the adding section 183 shown in FIG. 18, the start 

.0 position candidates of fixed waveforms in the individual 
channels correspond to those in Table 8, and channel 
numbers, fixed waveform numbers and symbols indicating the 
lengths and positions in use are those shown in FIG. 18 and 
Table 8. 

L5 The CELP type speech decoder in FIG. 19B comprises a 

fixed waveform storage section 181B for storing a plurality 
of fixed waveforms, a fixed waveform arranging section 182B 
for arranging (shifting) fixed waveforms, read from the 
fixed waveform storage section 181B, respectively at the 

20 selected positions, based on start position candidate 

information for fixed waveforms it has, an adding section 
183B for adding the fixed waveforms, arranged by the fixed 
waveform arranging section 182B, to yield an excitation 
vector c, a gain multiplier 197 for multiplying a random 

25 code vector gain gc, and a synthesis filter 198 for 

synthesizing the excitation vector c to yield a synthesized 
excitation vector s. 
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The fixed waveform storage section 18 IB and the fixed 
waveform arranging section 18 2B in the speech decoder have 
the same structures as the fixed waveform storage section 
181A and the fixed waveform arranging section 182A in the 
speech coder, and the fixed waveforms stored in the fixed 
waveform storage sections 181A and 18 IB have such 
characteristics as to statistically minimize the cost 
function in the equation 3, which is the coding distortion 
computation of the equation 3 using a random codebook 
searching target by cost -function based learning. 

The operation of the thus constituted speech coder 
will be discussed. 

The random codebook searching target x is time- 
reversed by the time reversing section 191, then 
synthesized by the synthesis filter 192 and then time- 
reversed again by the time reversing section 193, and the 
result is sent as a time-reversed synthesized target x 1 to 
the distortion calculator 205. 

The fixed waveform arranging section 182A arranges 
(shifts) the fixed waveform vl, read from the fixed 
waveform storage section 181A, at the position PI selected 
from start position candidates for CHI, based on start 
position candidate information for fixed waveforms it has 
as shown in Table 8, and likewise arranges the fixed 
waveforms v2 and v3 at the respective positions P2 and P3 
selected from start position candidates for CH2 and CH3. 
The arranged fixed waveforms are sent to the adding section 
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183A and added to become an excitation vector c, which is 
input to the synthesis filter 194. The synthesis filter 194 
synthesizes the excitation vector c to produce a 
synthesized excitation vector s and sends it to the 
distortion calculator 205. 

The distortion calculator 205 receives the time- 
reversed synthesized target x' , the excitation vector c and 
the synthesized excitation vector s and computes coding 
distortion in the equation 4. 

The distortion calculator 205 sends a signal to the 
fixed waveform arranging section 182A after computing the 
distortion. The process from the selection of start 
position candidates corresponding to the three channels by 
the fixed waveform arranging section 182A to the distortion 
computation by the distortion calculator 205 is repeated 
for every combination of the start position candidates 
selectable by the fixed waveform arranging section 182A. 

Thereafter, the combination of the start position 
candidates that minimizes the coding distortion is selected, 
and the code number which corresponds, one to one, to that 
combination of the start position candidates and the then 
optimal random code vector gain gc are transmitted as codes 
of the random codebook to the transmitter 196. 

The fixed waveform arranging section 182B selects the 
positions of the fixed waveforms in the individual channels 
from start position candidate information for fixed 
waveforms it has, based on information sent from the 
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transmitter 196, arranges (shifts) the fixed waveform vl, 
read from the fixed waveform storage section 181B, at the 
position PI selected from start position candidates for CHI, 
and likewise arranges the fixed waveforms v2 and v3 at the 
5 respective positions P2 and P3 selected from start position 
candidates for CH2 and CH3 . The arranged fixed waveforms 
are sent to the adding section 18 3B and added to become an 
excitation vector c. This excitation vector c is multiplied 
by the random code vector gain gc selected based on the 
10 information from the transmitter 196, and the result is 

sent to the synthesis filter 198. The synthesis filter 198 
synthesizes the gc-multiplied excitation vector c to yield 
a synthesized excitation vector s and sends it out. 

According to the speech coder/decoder with the above 
15 structures, as an excitation vector is generated by the 
excitation vector -generator which comprises the fixed 
waveform storage section, fixed waveform arranging section 
and the adding section, a synthesized excitation vector 
obtained by synthesizing this excitation vector in the 
20 synthesis filter has such a characteristic statistically 

close to that of an actual target as to be able to yield a 
high-quality synthesized speech, in addition to the 
advantages of the tenth mode. 

Although the foregoing description of this mode has 
25 been given with reference to a case where fixed waveforms 
obtained by learning are stored in the fixed waveform 
storage sections 181A and 181B, high-quality synthesized 
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speeches can also obtained even when fixed waveforms 
prepared based on the result of statistical analysis of the 
random codebook searching target x are used or when 
knowledge-based fixed waveforms are used. 
5 While the description of this mode has been given with 

reference to a case of using three fixed waveforms, similar 
functions and advantages can be provided if the number of 
fixed waveforms is changed to other values. 

Although the fixed waveform arranging section in this 
10 mode has been described as having the start position 

candidate information of fixed waveforms given in Table 8, 
similar functions and advantages can be provided for other 
start position candidate information of fixed waveforms 
than those in Table 8. 
15 (Twelfth Mode) 

FIG. 20 presents a structural block diagram of a CELP 
type speech coder according to this mode. 

This CELP type speech coder includes a fixed waveform 
storage section 200 for storing a plurality of fixed 
20 waveforms (three in this mode: CH1:W1. CH2 : W2 and CH3 : W3 ) . 
and a fixed waveform arranging section 201 which has start 
position candidate information of fixed waveforms for 
generating start positions of the fixed waveforms, stored 
in the fixed waveform storage section 200, according to 
25 algebraic rules. This CELP type speech coder further has a 
fixed waveform an impulse response calculator 202 for each 
waveform, an impulse generator 203, a correlation matrix 
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calculator 204, a time reversing section 191, a synthesis 
filter 192' for each waveform, a time reversing section 193 
and a distortion calculator 205. 

The impulse response calculator 202 has a function of 
convoluting three fixed waveforms from the fixed waveform 
storage section 200 and the impulse response h (length L = 
subframe length) of the synthesis filter to compute three 
kinds of impulse responses for the individual fixed 
waveforms (CHI: hi, CH2:h2 and CH3:h3, length L = subframe 
length) . 

The synthesis filter 192' has a function of 
convoluting the output of the time reversing section 191, 
which is the result of the time -reversing the random 
codebook searching target x to be input, and the impulse 
responses for the individual waveforms, hi. h2 and h.3, from 
the impulse response calculator 202. 

The impulse generator 203 sets a pulse of an amplitude 
1 (a polarity present) only at the start position 
candidates PI, P2 and P3, selected by the fixed waveform 
arranging section 201, generating impulses for the 
individual channels (CHl:dl, CH2:d2 and CH3:d3). 

The correlation matrix calculator 204 computes 
autocorrelation of each of the impulse responses hi, h2 and 
h3 for the individual waveforms from the impulse response 
calculator 202, and correlations between hi and h2, hi and 
h3, and h2 and h3, and develops the obtained correlation 
values in a correlation matrix RR. 
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The distortion calculator 205 specifies the random 
code vector that minimizes the coding distortion, from an 
equation 37, a modification of the equation 4, by using 
three time -reversed synthesis targets (x'l, x'2 and 
x'3),the correlation matrix RR and the three impulses (dl, 
d2 and d3) for the individual channels. 
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where di: impulse (vector) for each channel 

di = ±1 x 6 (k - p ) ,k = 0 to L-l^: n start position 

candidates of the i-th channel 

H : impulse response convolution matrix for each 

waveform (H = HW ) 

i i 
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where w is the fixed waveform (length: L ) of the 
i 1 

i-th channel 

x 1 : vector obtained by time reverse synthesis of x 

i 

using H (x' * = x*H ) . 

i i i 

Here, transformation from the equation 4 to the 
equation 37 is shown for each of the denominator term 
(equation 38) and the numerator term (equation 39). 

(x l H c) 2 

= (x ' H (W x d , + W z d 2 + W 3 d 3 )) 2 

= (x' (H x d x + H z d 2 + H 3 d 3 )) 2 

= (.(x'HJd, + (x ' H 2 )d 2 + (x' H 3 )d 3 ) 2 

= {x['d x + x' 1 t d 1 + x^d,) 2 

-ill**; 1 **) 2 (38) 

where x: random codebook searching target (vector) 
x% transposed vector of x 

H: impulse response convolution matrix of the 

synthesis filter 

c: random code vector (c « + + W^) 

W : fixed waveform convolution matrix 
i 

di: impulse (vector) for each channel 

H : impulse response convolution matrix for each 

waveform (H = HW ) 

i i 

x 1 : vector obtained by time reverse synthesis of x 
i 

t t 

using H (x f = xH ), 
i i i 
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\\Hc\f 

=\\H(W x d x +W 2 d 2 + W 3 d 3 )\\ 2 
=\\H x d l + H 2 d 2 + H 3 d 3 \\ 2 

= (H x d x + H 2 d 2 + H 3 d 3 )'(H x d x + H 2 d 2 + H 3 d 3 ) 
= {d[H[ +d' 2 H' 2 + d' 3 H' 3 )(H x d x + H 2 d 2 +H 3 d 3 ) 



where H: impulse response convolution matrix of the 

synthesis filter 

c: random code vector (c = Wldl + W2d2 + W3d3) 
W : fixed waveform convolution matrix 



H : impulse response convolution matrix for each 
waveform (H = HW ) 

i i 

The operation of the thus constituted CELP type speech 
coder will be described. 

To begin with f the impulse response calculator 202 
convolutes three fixed waveforms stored and the impulse 
response h to compute three kinds of impulse responses hi, 
h2 and h3 for the individual fixed waveforms, and sends 
them to the synthesis filter 192 1 and the correlation 
matrix calculator 204. 

Next, the synthesis filter 192 1 convolutes the random 
codebook searching target x, time-reversed by the time 
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3 



(39) 



di: impulse (vector) for each channel 



reversing section 191, and the input three kinds of impulse 
responses hi, h2 and h3 for the individual waveforms. The 
time reversing section 193 time-reverses the three kinds of 
output vectors from the synthesis filter 192' again to 
5 yield three time-reversed synthesis targets x'l, x'2 and 
x'3, and sends them to the distortion calculator 205. 

Then, the correlation matrix calculator 204 computes 
autocorrelations of each of the input three kinds of 
impulse responses hi, h2 and h3 for the individual 
_ 10 waveforms and correlations between hi and h2, hi and h3, 

= s 

4* and h2 and h3, and sends the obtained autocorrelations and 

5 correlations value to the distortion calculator 205 after 

a 5 ii 

J3 . developing them in the correlation matrix RR. 

5 The above process having been executed as a pre- 

D 15 process, the fixed waveform arranging section 201 selects 

J one start position candidate of a fixed waveform for each 

n channel, and sends the positional information to the 

impulse generator 203. 

The impulse generator 203 sets a pulse of an amplitude 
20 1 (a polarity present) at each of the start position 

candidates, obtained from the fixed waveform arranging 
section 201, generating impulses dl , d2 and d3 for the 
individual channels and sends them to the distortion 
calculator 205. 
25 Then, the distortion calculator 205 computes a 

reference value for minimizing the coding distortion in the 
equation 37, by using three time-reversed synthesis targets 



x'l, x'2 and x'3 for the individual waveforms, the 
correlation matrix RR and the three impulses dl . d2 and d3 
for the individual channels. 

The process from the selection of start position 
candidates corresponding to the three channels by the fixed 
waveform arranging section 201 to the distortion 
computation by the distortion calculator 205 is repeated 
for every combination of the start position candidates 
selectable by the fixed waveform arranging section 201. 
Then, code number which corresponds to the combination of 
the start position candidates that minimizes the reference 
value for searching the coding distortion in the equation 
37 and the then optimal gain are specified with the random 
code vector gain gc used as a code of the random codebook, 
and are transmitted to the transmitter. 

The speech decoder of this mode has a similar 
structure to that of the tenth mode in FIG. 19B, and the 
fixed waveform storage section and the fixed waveform 
arranging section in the speech coder have the same 
structures as the fixed waveform storage section and the 
fixed waveform arranging section in the speech decoder. The 
fixed waveforms stored in the fixed waveform storage 
section is a fixed waveform having such characteristics as 
to statistically minimize the cost function in the equation 
3 by the training using the coding distortion equation 
(equation 3) with a random codebook searching target as a 
cost-function. 



91 



According to the thus constructed speech coder/aecoder, 
when the start position candidates of fixed waveforms in 
the fixed waveform arranging section can be computed 
algebraically, the numerator in the equation 37 can be 
computed by adding the three terms of the time-reversed 
synthesis target for each waveform, obtained in the 
previous processing stage, and then obtaining the square of 
the result. Further, the numerator in the equation 37 can 
be computed by adding the nine terms in the correlation 
matrix of the impulse responses of the individual waveforms 
obtained in the previous processing stage. This can ensure 
searching with about the same amount of computation as 
needed in a case where the conventional algebraic 
structural excitation vector (an excitation vector is 
constituted by several pulses of an amplitude 1) is used 
for the random codebook. 

Furthermore, a synthesized excitation vector in the 
synthesis filter has such a characteristic statistically 
close to that of an actual target as to be able to yield a 
high-quality synthesized speech. 

Although the foregoing description of this mode has 
been given with reference to a case where fixed waveforms 
obtained through training are stored in the fixed waveform 
storage section, high-quality synthesized speeches can also 
obtained even when fixed waveforms prepared based on the 
result of statistical analysis of the random codebook 
searching target x are used or when knowledge-based fixed 



waveforms are used. 

While the description of this mode has been given with 
reference to a case of using three fixed waveforms, similar 
functions and advantages can be provided if the number of 
fixed waveforms is changed to other values. 

Although the fixed waveform arranging section in this 
mode has been described as having the start position 
candidate information of fixed waveforms given in Table 8, 
similar functions and advantages can be provided for other 
start position candidate information of fixed waveforms 
than those in Table 8. 
(Thirteenth Mode) 

FIG. 21 presents a structural block diagram of a CELP 
type speech coder according to this mode. The speech coder 
according to this mode has two kinds of random codebooks A 
211 and B 212, a switch 213 for switching the two kinds of 
random codebooks from one to the other, a multiplier 214 
for multiplying a random code vector by a gain, a synthesis 
filter 215 for synthesizing a random code vector output 
from the random codebook that is connected by means of the 
switch 213, and a distortion calculator 216 for computing 
coding distortion in the equation 2. 

The random codebook A 211 has the structure of the 
excitation vector generator of the tenth mode, while the 
other, random codebook B 212 is constituted by a random 
sequence storage section 217 storing a plurality of random 
code vectors generated from a random sequence. Switching 
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between the random codebooks is carried out in a closed 
loop. The x is a random codebook searching target. 

The operation of the thus constituted CELP type speech 
coder will be discussed. 

First, the switch 213 is connected to the random 
codebook A 211. and the fixed waveform arranging, section 
182 arranges (shifts) the fixed waveforms . read from the 
fixed waveform storage section 181, at the positions 
selected from start position candidates of fixed waveforms 
respectively, based on start position candidate information 
for fixed waveforms it has as shown in Table 8. The 
arranged fixed waveforms are added together in the adding 
section 18 3 to become a random code vector, which is sent 
to the synthesis filter 215 after being multiplied by the 
random code vector gain. The synthesis filter 215 
synthesizes the input random code vector and sends the 
result to the distortion calculator 216. 

The distortion calculator 216 performs minimization of 
the coding distortion in the equation 2 by using the -random 
codebook searching target x and the synthesized code vector 
obtained from the synthesis filter 215. 

After computing the distortion, the distortion 
calculator 216 sends a signal to the fixed waveform 
arranging section 182. The process from the selection of 
start position candidates corresponding to the three 
channels by the fixed waveform arranging section 182 to the 
distortion computation by the distortion calculator 216 is 
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repeated for every combination.; of the start position 
candidates selectable by the fixed waveform arranging 
section 182. 

Thereafter, the combination of the start position 
candidates that minimizes the coding distortion is selected, 
and the code number which corresponds, one to one, to that 
combination of the start position, candidates , the then 
optimal random code vector gain gc and the minimum coding 
distortion value are memorized - 

Then, the switch 213 is connected to the random 
O codebook B 212, causing a random sequence read from the 

6J random sequence storage section 217 to become a random code 

U vector. This random- code vector, after being multiplied by 

M the random code vector gain. Is input to the synthesis 

filter 215. The synthesis filter 215 synthesizes the input 
random code vector and sends the result to the distortion 

calculator 216. 

The distortion calculator 216 computes the coding 
distortion in the equation 2 by using the random codebook 
searching target x and the synthesized code vector obtained 
from the synthesis filter 215. 

After computing the distortion, the distortion 
calculator 2i6 sends a signal to the random sequence 
storage section 217. The process from the selection of the 
25 random code vector by the random sequence storage section 
217 to the distortion computation by the distortion 
calculator 216 Is repeated for every random code vector 



selectable by the random sequence storage section 217 . 

Thereafter, the random code vector that minimizes the 
coding distortion is selected, and the code number of that 
random code vector, the then optimal random code vector 
gain gc and the minimum coding distortion value are 
memorized. 

Then, the distortion calculator 216 compares the 
minimum coding distortion value obtained when the switch 
213 is connected to the random codebook A 211 with the 
minimum coding distortion value obtained when the switch 
213 is connected to the random codebook B 212, determines 
switch connection information when smaller coding 
distortion was obtained, the then code number and the 
random code vector gain are determined as speech codes, and 
are sent to an unillustrated transmitter. 

The speech decoder according to this mode which is 
paired with the speech coder of this mode has the random 
codebook A, the random codebook B, the switch, the random 
code vector gain and the synthesis filter having the same 
structures and arranged in the same way as those in FIG. 21, 
a random codebook to be used, a random code vector and a 
random code vector gain are determined based on a speech 
code input from the transmitter, and a synthesized 
excitation vector is obtained as the output of the 
synthesis filter. 

According to the speech coder/decoder with the above 
structures, one of the random code vectors to be generated 
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from the random codebook A and the random code vectors to 
be generated from the random codebook B, which minimizes 
the coding distortion in the equation 2, can be selected in 
a closed loop, making it possible to generate an excitation 
vector closer to an actual speech and a high-quality 
synthesized speech. 

Although this mode has been illustrated as a speech 
coder/ decoder based on the structure in FIG. 2 of the 
conventional CELP type speech coder, similar functions and 
advantages can be provided even if this mode is adapted to 
a CELP type speech coder/decoder based on the structure in 
FIGS. 19A and 19B or FIG. 20. 

Although the random codebook A 211 in this mode has 
the same structure as shown in FIG. 18, similar functions 
and advantages can be provided even if the fixed waveform 
storage section 181 takes another structure (e.g., in a 
case where it has four fixed waveforms). 

While the description of this mode has been given with 
reference to a case where the fixed waveform arranging 
section 182 of the random codebook A 211 has the start 
position candidate information of fixed waveforms as shown 
in Table 8, similar functions and advantages can be 
provided even for a case where the section 182 has other 
start position candidate information of fixed waveforms. 



Although this mode has been described with reference 
to a case where the random codebook B 212 is constituted by 
the random sequence storage section 217 for directly 
storing a plurality of random sequences in the memory, 
similar functions and advantages can be provided even for a 
case where the random codebook B 212 takes other excitation 
vector structures (e.g., when it is constituted by 
excitation vector generation information with an algebraic 
structure) . 

Although this mode has been described as a CELP type 
speech coder/decoder having two kinds of random codebooks, 
similar functions and advantages can be provided even in a 
case of using a CELP type speech coder/ decoder having three 
or more kinds of random codebooks . 
(Fourteenth Mode) 

FIG. 22 presents a structural block diagram of a CELP 
type speech coder according to this mode. The speech coder 
according to this mode has two kinds of random codebooks. 
One random codebook has the structure of the excitation 
vector generator shown in FIG. 18, and the other one is 
constituted of a pulse sequences storage section which 
retains a plurality of pulse sequences. The random 
codebooks are adaptively switched from one to the other by 
using a quantized pitch gain already acquired before random 

codebook search. 

The random codebook A 211, which comprises the fixed 
waveform storage section 181, fixed waveform arranging 
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section 182 and adding section 183, corresponds to the 
excitation vector generator in FIG. 18. A random codebook B 
221 is comprised of a pulse sequences storage section 222 
where a plurality of pulse sequences are stored. The random 
5 codebooks A 211 and B 221 are switched from one to the 

other by means of a switch 213' . A multiplier 224 outputs 
an adaptive code vector which is the output of an adaptive 
codebook 223 multiplied by the pitch gain that has already 
been acquired at the time of random codebook search. The 
10 output of a pitch gain quantizer 225 is given to the switch 
213'. 

The operation of the thus constituted CELP type speech 
coder will be described. 

According to the conventional CELP type speech coder, 
15 the adaptive codebook 223 is searched first, and the random 
codebook search is carried out based on the result. This 
adaptive codebook search is a process of selecting an 
optimal adaptive code vector from a plurality of adaptive 
code vectors stored in the adaptive codebook 223 (vectors 
20 each obtained by multiplying an adaptive code vector and a 
random code vector, by their respective gains and then 
adding them together). As a result of the process, the code 
number and pitch gain of an adaptive code vector are 
generated. 

25 According to the CELP type speech coder of this mode, 

the pitch gain quantizer 225 quantizes this pitch gain, 
generating a quantized pitch gain, after which random 
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codebook search will be performed. The quantized pitch gain 
obtained by the pitch gain quantizer 225 is sent to the 
switch 213' for switching between the random codebooks. 

"The switch 213' connects to the random codebook A 211 
when the value of the quantized pitch gain is small, by 
which it is considered that the input speech is unvoiced, 
and connects to the random codebook B 221 when the value of 
the quantized pitch gain is large, by which it is 
considered that the input speech is voiced. 

When the switch 213' is connected to the random 
codebook A 211, the fixed waveform arranging section 182 
arranges (shifts) the fixed waveforms, read from the fixed 
waveform storage section 181, at the positions selected 
from start position candidates of fixed waveforms 
respectively, based on start position candidate information 
for fixed waveforms it has as shown in Table 8. The 
arranged fixed waveforms are sent to the adding section 183 
and added together to become a random code vector. The 
random code vector is sent to the synthesis filter 215 
after being multiplied by the random code vector gain. The 
synthesis filter 215 synthesizes the input random code 
vector and sends the result to the distortion calculator 
216. 

The distortion calculator 216 computes coding 
distortion in the equation 2 by using the target x for 
random codebook search and the synthesized code vector 
obtained from the synthesis filter 215. 
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After computing the distortion, the distortion 
calculator 216 sends a signal to the fixed waveform 
arranging section 182. The process from the selection of 
start position candidates corresponding to the three 
5 channels by the fixed waveform arranging section 182 to the 
distortion computation by the distortion calculator 216 is 
repeated for every combination of the start position 
candidates selectable by the fixed waveform arranging 
section 182. 

0 Thereafter, the combination of the start position 

candidates that minimizes the coding distortion is selected, 
and the code number which corresponds, one to one, to that 
combination of the start position candidates, the then 
optimal random code vector gain gc and the quantized pitch 

L5 gain are transferred to a transmitter as a speech code. In 
this mode, the property of unvoiced sound should be 
reflected on fixed waveform patterns to be stored in the 
fixed waveform storage section 181. before speech coding 
takes places . 

20 When the switch 213' is connected to the random 

codebook B 212, a pulse sequence read from the pulse 
sequences storage section 222 becomes a random code vector. 
This random code vector is input to the synthesis filter 
215 through the switch 213' and multiplication of the 

25 random code vector gain. The synthesis filter 215 

synthesizes the input random code vector and sends the 
result to the distortion calculator 216. 
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The distortion calculator 216 computes the coding 
distortion in the equation 2 by using the target x for 
random codebook search X and the synthesized code vector 
obtained from the synthesis filter 215. 

After computing the distortion, the distortion 
calculator 216 sends a signal to the pulse sequences 
storage section 222. The process from the selection of the 
random code vector by the pulse sequences storage section 
222 to the distortion computation by the distortion 
calculator 216 is repeated for every random code vector 
selectable by the pulse sequences storage section 222. 

Thereafter, the random code vector that minimizes the 
coding distortion is selected, and the code number of that 
random code vector, the then optimal random code vector 
gain gc and the quantized pitch gain are transferred to the 
transmitter as a speech code. 

The speech decoder according to this mode which is 
paired with the speech coder of this mode has the random 
codebook A, the random codebook B, the switch, the random 
code vector gain and the synthesis filter having the same 
structures and arranged in the same way as those in FIG. 22 
First, upon reception of the transmitted quantized pitch 
gain, the coder side determines from its level whether the 
switch 213 1 has been connected to the random codebook A 211 
or to the random codebook B 221. Next, based on the code 
number and the sign of the random code vector, a 
synthesized excitation vector is obtained as the output of 
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the synthesis filter. 
^ According to the speech coder/decoder with the above 

structures, two kinds of random codebooks can be switched 
adaptively in accordance with the characteristic of an 
5 input speech (the level of the quantized pitch gain is used 
to determine the transmitted quantized pitch gain in this 
mode), so that when the input speech is voiced, a pulse 
sequence can be selected as a random code vector whereas 
for a strong voiceless property, a random code vector which 

10 reflects the property of voiceless sounds can be selected. 
This can ensure generation of excitation vectors closer to 
the actual sound property and improvement of synthesized 
sounds. Because switching is performed in a closed loop in 
this mode as mentioned above, the functional effects can be 

15 improved by increasing the amount of information to be 
transmitted. 

Although this mode has been illustrated as a speech 
coder /decoder based on the structure in FIG. 2 of the 
conventional CELP type speech coder, similar functions and 

20 advantages can be provided even if this mode is adapted to 
a CELP type speech coder/decoder based on the structure in 
FIGS. 19A and 19B or FIG. 20. 

In this mode, a quantized pitch gain acquired by 
quantizing the pitch gain of an adaptive code vector in the 

25 pitch gain quantizer 225 is used as a parameter for 

switching the switch 213'. A pitch period calculator may b 
provided so that a pitch period computed from an adaptive 




code vector can be used instead. 

Although the random codebook A 211 in this mode has 
the same structure as shown in FIG. 18, similar functions 
and advantages can be provided even if the fixed waveform 
5 storage section 181 takes another structure (e.g., in a 
case where it has four fixed waveforms). 

While the description of this mode has been given with 
reference to the case where the fixed waveform arranging 
section 182 of the random codebook A 211 has the start 
LO position candidate information of fixed waveforms as shown 
in Table 8, similar functions and advantages can be 
provided even for a case where the section 182 has other 
start position candidate information of fixed waveforms. 
Although this mode has been described with reference 
15 to the case where the random codebook B 212 is constituted 
by. the pulse sequences storage section 222 for directly 
storing a pulse sequence in the memory, similar functions 
and advantages can be provided even for a case where the 
random codebook B 212 takes other excitation' vector 
20 structures (e.g.. when it is constituted by excitation 

vector generation information with an algebraic structure) . 

Although this mode has been described as a CELP type 
speech coder/decoder having two kinds of random codebooks, 
similar functions and advantages can be provided even in a 
25 case of using a CELP type speech coder/decoder having three, 
or more kinds of random codebooks. 
(Fifteenth Mode) 
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FIG, 23 presents a structural block diagram of a CELP 
type speech coder according to this mode. The speech coder 
according to this mode has two kinds of random codebooks . 
One random codebook takes the structure of the excitation 
vector, generator shown in FIG . 18 and has three fixed 
waveforms stored in the fixed waveform storage section, and 
the other one likewise takes the structure of the 
excitation vector generator shown in FIG. 18 but has two 
fixed waveforms stored in the fixed waveform storage 
section. Those two kinds of random codebooks are switched 
in a closed loop. 

The random codebook A 211, which comprises a fixed 
waveform storage section A 181 having three fixed waveforms 
stored therein, fixed waveform arranging section A 182 and 
adding section 183, corresponds to the structure of the 
excitation vector generator in FIG. 18 which however has 
three fixed waveforms stored in the fixed waveform storage 
section. 

A random codebook B 230 comprises a fixed waveform 
storage section B 231 having two fixed waveforms stored 
therein, fixed waveform arranging section B 232 having 
start position candidate information of fixed waveforms as 
shown in Table 9 and adding section 233, which adds two 
fixed waveforms, arranged by the fixed waveform arranging 
section B 232, thereby generating a random code vector. The 
random codebook B 230 corresponds to the structure of the 
excitation vector generator in FIG. 18 which however has 
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two fixed waveforms stored in the fixed waveform storage 

section. 

Table 9 



Channel 
number 


Sign 


Channel number Sign Start position 
candidates fixed waveforms 


CHI 


±1 


,0, 4, 8, 12, 1 6, 7 2, 7 6 

PI 

6, 1 0, 14, 18, 7 4, 7 8 


CH2 


±1 


.1, 5, 9, 13, 17, 73. 77 

P2 

*3. 7, 11, 15, 1 9, 7 5,79 



The other structure is the same as. that of the above - 
described thirteenth mode. 

The operation of the CELP type speech coder 
constructed in the above way will be described. 

First, the switch 213 is connected to the random 
codebook A 211, and the fixed waveform arranging section A 
182 arranges (shifts) three fixed waveforms, read from the 
fixed waveform storage section A 181, at the positions 
selected from start position candidates of fixed waveforms 
respectively, based on start position candidate information 
for fixed waveforms it has as shown in Table 8. The 
arranged three fixed waveforms are output to the adding 
section 183 and added together to become a random code 
vector. This random code vector is sent to the synthesis 
filter 215 through the switch 213 and the multiplier 214 
for multiplying it by the random code vector gain. The 
synthesis filter 215 synthesizes the input random code 




vector and sends the result to the distortion calculator 
216. 

The distortion calculator 216 computes coding 
distortion in the equation 2 by using the random codebook 
search target X and the synthesized code vector obtained 
from the synthesis filter 215. 

After computing the distortion, the distortion 
calculator 216 sends a signal to the fixed waveform 
arranging section A 182. The process from the selection of 
start position candidates corresponding to the three 
channels by the fixed waveform arranging section A 182 to 
the distortion computation by the distortion calculator 216 
is repeated for every combination of the start position 
candidates selectable by the fixed waveform arranging 
section A 182. 

Thereafter, the combination of the start position 
candidates that minimizes the coding distortion is selected, 
and the code number which corresponds, one to one, to that 
combination of the start position candidates , the then 
optimal random code vector gain gc and the minimum coding 
distortion value are memorized. 

In this mode, the fixed waveform patterns to be stored 
in the fixed waveform storage section A 181 before speech 
coding are what have been acquired through training in such 
a way as to minimize distortion under the condition of 
three fixed waveforms in use. 

Next, the switch 213 is connected to the random 
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codebook B 230, and the fixed waveform arranging section B 
232 arranges (shifts) two fixed waveforms, read from the 
fixed waveform storage section B 231. at the positions 
selected from start position candidates of fixed waveforms 

5 respectively, based on start position candidate information 
for fixed waveforms it has as shown in Table 9. The 
arranged two fixed waveforms are output to the adding 
section 233 and added together to become a random code 
vector. This random code vector is sent to the synthesis 

L0 filter 215 through the switch 213 and the multiplier 214 
for multiplying it by the random code vector gain. The 
synthesis filter 215 synthesizes the input random code 
. vector and sends the result to the distortion calculator 
216. 

15 The distortion calculator 216 computes coding 

distortion in the equation 2 by using the target x for 
random codebook search X and the synthesized code vector 
obtained from the synthesis filter 215. 

After computing the distortion, the distortion 

20 calculator 216 sends a signal to the fixed waveform 

arranging section B 232. The process from the selection of 
start position candidates corresponding to the three 
channels by the fixed waveform arranging section B 232 to 
the distortion computation by the distortion calculator 216 

25 is repeated for every combination of the start position 
candidates selectable by the fixed waveform arranging 
section B 232. 
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Thereafter, the combination of the start position 
candidates that minimizes the coding distortion is selected, 
and the code number which corresponds, one to one, to that 
combination of the start position candidates, the then 
optimal random code vector gain gc and the minimum coding 
distortion value are memorized. In this mode, the fixed 
waveform patterns to be stored in the fixed waveform 
storage section B 231 before speech coding are what have 
been acquired through training in such a way as to minimize 
distortion under the condition of two fixed waveforms in 
use. 

Then, the distortion calculator 216 compares the 
minimum coding distortion value obtained when the switch 
213 is connected to the random codebook B 230 with the 
minimum coding distortion value obtained when the switch 
213 is connected to the random codebook B 212, determines 
switch connection information when smaller coding 
distortion was obtained, the then code number and the 
random code vector gain are determined as speech codes, and 
are sent to the transmitter. 

The speech decoder according to this mode has the 
random codebook A, the random codebook B, the switch, the 
random code vector gain and the synthesis filter having the 
same structures and arranged in the same way as those in 
FIG. 23, a random codebook to be used, a random code vector 
and a random code vector gain are determined based on a 
speech code input from the transmitter, and a synthesized 




excitation vector is obtained as the output of the 
synthesis filter. 

According to the speech coder/decoder with the above 
structures, one of the random code vectors to be generated 
from the random codebook A and the random code vectors to 
be generated from the random codebook B, which minimizes 
the coding distortion in the equation 2, can be selected in 
a closed loop, making it possible to generate an excitation 
vector closer to an actual speech and a high-quality 
synthesized speech. 

Although this mode has been illustrated as a speech 
coder/decoder based on the structure in FIG. 2 of the 
conventional CELP type speech coder, similar functions and 
advantages can be provided even if this mode is adapted to 
a CELP type speech coder/decoder based on the structure in 
FIGS. 19A and 19B or FIG. 20. 

Although this mode has been described with reference 
to the case where the fixed waveform storage section A 181 
of the random codebook A 211 stores three fixed waveforms, 
similar functions and advantages can be provided even if 
the fixed waveform storage section A 181 stores a different 
number of fixed waveforms (e.g., in a case where it has 
four fixed waveforms). The same is true of the random 

codebook B 230. 

While the description of this mode has been given with 
reference to the case where the fixed waveform arranging 
section A 182 of the random codebook A 211 has the start 
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position candidate information of fixed waveforms as shown 
in Table 8, similar functions and advantages can be 
provided even for a case where the section 182 has other 
start position candidate information of fixed waveforms. 
The same is applied to the random codebook B 230. 

Although this mode has been described as a CELP type 
speech coder/decoder having two kinds of random codebooks, 
similar functions and advantages can be provided even in a 
case of using a CELP type speech coder/decoder having three 
or more kinds of random codebooks . 
(Sixteenth Mode) 

FIG. 24 presents a structural block diagram of a CELP 
type speech coder according to this mode. The speech coder 
acquires LPC coefficients by performing autocorrelation 
analysis and LPC analysis on input speech data 241 in an 
LPC analyzing section 242, encodes the obtained LPC 
coefficients to acquire LPC codes, and encodes the obtained 
LPC codes to yield decoded LPC coefficients. 

Next, an excitation vector generator 245 acquires an 
adaptive code vector and a random code vector from an 
adaptive codebook 243 and an excitation vector generator 
244, and sends them to an LPC synthesis filter 246. One of 
the excitation vector generators of the above -described 
first to fourth and tenth modes is used for the excitation 
vector generator 244. Further, the LPC synthesis filter 246 
filters two excitation vectors, obtained by the excitation 
vector generator 245, with the decoded LPC coefficients 
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obtained by the LPC analyzing section 242, thereby yielding 
two synthesized speeches, 

A comparator 247 analyzes a relationship between the 
two synthesized speeches, obtained by the LPC synthesis 
filter 246, and the input speech, yielding optimal values 
(optimal gains) of the two synthesized speeches, adds the 
synthesized speeches whose powers have been adjusted with 
the optimal gains, acquiring a total synthesized speech, 
and then computes a distance between the total synthesized 
speech and the input speech. 

Distance computation is also carried out on the input 
speech and multiple synthesized speeches, which are 
obtained by causing the excitation vector generator 245 and 
the LPC synthesis filter 246 to function with respect to 
all the excitation vector samples those are generated by 
the random codebook 243 and the excitation vector generator 
244.. Then, the index of the excitation vector sample which 
provides the minimum one of the distances obtained from the 
computation. The obtained optimal gains, the obtained index 
of the excitation vector sample and two excitation vectors 
corresponding to that index are sent to a parameter coding 
section 248. 

The parameter coding section 248 encodes the optimal 
gains to obtain gain codes, and the LPC codes and the index 
of the excitation vector sample are all sent to a 
transmitter 249. An actual excitation signal is produced 
from the gain codes and the two excitation vectors 
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corresponding to the index, and an old excitation vector 
sample is discarded at the same time the excitation signal 
is stored in the adaptive codebook 243. 

FIG. 25 shows functional blocks of a section in the 
parameter coding section 248, which is associated with 
vector quantization of the gain. 

The parameter coding section 248 has a parameter 
converting section 2502 for converting input optimal gains 
2501 to a sinn of elements and a ratio with respect to the 
sum to acquire quantization target vectors, a target vector 
extracting section 2503 for obtaining a target vector by 
using old decoded code vectors, stored in a decoded vector 
storage section, and predictive coefficients stored in a 
predictive coefficients storage section, a decoded vector 
storage section 2504 where old decoded code vectors are 
stored, a predictive coefficients storage section 2505, a 
distance calculator 2506 for computing distances between a 
plurality of code vectors stored in a vector codebook and a 
target vector obtained by the target vector extracting 
section by using predictive coefficients stored in the 
predictive coefficients storage section, a vector codebook 
2507 where a plurality of code vectors are stored, and a 
comparator 2508, which controls the vector codebook and the 
distance calculator for comparison of the distances 
obtained from the distance calculator to acquire the number 
of the most appropriate code vector, acquires a code vector 
from the vector storage section based on the obtained 
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number, and updates the content of the decoded vector 
storage section using that code vector. 

A detailed description will now be given of the 
operation of the thus constituted parameter coding section 
248. The vector codebook 2507 where a plurality of general 
samples (code vectors) of a quantization target vector are 
stored should be prepared in advance. This is generally 
prepared by an LBG algorithm (IEEE TRANSACTIONS ON 
COMMUNICATIONS , VOL. COM-28. NO. 1, PP 84-95, JANUARY 1980) 
based on multiple vectors which are obtained by analyzing 
multiple speech data. 

Coefficients for predictive coding should be stored in 
the predictive coefficients storage section 2505. The 
predictive coefficients will now be discussed after 
describing the algorithm. A value indicating a unvoiced 
stateshould be stored as an initial value in the decoded 
vector storage section 2504. One example would be a code 
vector with the lowest power. 

First, the input optimal gains 2501 (the gain of an 
adaptive excitation vector and the gain of a random 
excitation vector) are converted to element vectors 
(inputs) of a sum and a ratio in the parameter converting 
section 2502. The conversion method is illustrated in an 
equation 40. 



P = log(Ga + Gs) 
R » Ga/(Ga + Gs) 
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where (Ga, Gs): optical gain 

Ga: gain of an adaptive excitation vector 
Gs: gain of stochastic excitation vector 
(P, R) : input vectors 
P : sum 
R: ratio. 

It is to be noted that Ga above should not necessarily 
be a positive value. Thus, R may take a negative value. 
When Ga + Gs becomes negative, a fixed value prepared in 
advance is substituted. 

Next, based on the vectors obtained by the parameter 
converting section 2502, the target vector extracting 
section 2503 acquires a target vector by using old decoded 
code vectors, stored in the decoded vector storage section 
2504, and predictive coefficients stored in the predictive 
coefficients storage section 2504. An equation for 
computing the target vector is given by an equation 41. 





(4 1) 



where (Tp, Tr) : target vector 



(P, R) : input vector 



(pi, ri): old decoded vector 

Upi, Vpi, Uri, Vri: predictive coefficients (fixed 



values ) 



i: index indicating how old the decoded vector is 
1: prediction order. 

Then, the distance calculator 2506 computes a distance 
between a target vector obtained by the target vector 
extracting section 2503 and a code vector stored in the 
vector codebook 2507 by using the predictive coefficients 
stored in the predictive coefficients storage section 2505. 
An equation for computing the distance is given by an 
equation 42. 

2 

Dn = Wp X (Tp - UpO X Cpn - VpO X Cm) 

+ Wr X (Tr - UpO X Cpn - VrO X Crn) 2 (42) 

where Dn: distance between a target vector and a code 
vector 

(Tp, Tr): target vector 

UpO, VpO, UrO, VrO: predictive coefficients (fixed 
values) 

(Cpn, Crn): code vector 

n: the number of the code vector 

Wp, Wr: weighting coefficient (fixed) for adjusting 
the sensitivity against distortion. 

Then, the comparator 2508 controls the vector codebook 
2507 and the distance calculator 2506 to acquire the number 
of the code vector which has the shortest distance computed 
by the distance calculator 2506 from among a plurality of 
code vectors stored in the vector codebook 2507, and sets 
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the number as a gain code 2509. Based on the obtained gain 
code 2509, the comparator 2508 acquires a decoded vector 
and updates the content of the decoded vector storage 
section 2504 using that vector. An equation 43 shows how to 
5 acquire a decoded vector. 

p = (^Upi x pi + gVpi x ri) + UpO x Cpn + VpO x Cm 

R = (VUrix pi + J Vri x ") + Ur0 x Cpn + Vr0xCrn ( 4 3 ) 

where (Cpn, Cm): code vector 
(P, r): decoded vector 
10 (pi, ri): old decoded vector 

Upi, Vpi, Uri, Vri: predictive coefficients (fixed 

values ) 

i: index indicating how old the decoded vector is 
1: prediction order. 
15 n: the number of the code vector. 

An equation 44 shows an updating scheme. 

Processing order 
pO = CpN 

20 rO = CrN 

pi = pi - 1 (i = 1 ~ 1) 
ri = ri - 1 (i = 1 ~ 1) < 44 > 
N: code of the gain. 
Meanwhile, the decoder, which should previously be 

25 provided with a vector codebook, a predictive coefficients 
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storage section and a coded vector storage section similar 



« 10 



O 15 



to those of the coder, performs decoding through the 
functions of the comparator of the coder of generating a 
decoded vector and updating the decoded vector storage 
section, based on the gain code transmitted from the coder* 

A scheme of setting predictive coefficients to be 
stored in the predictive coefficients storage section 2505 
will now be described. 

Predictive coefficients, are obtained by quantizing a 
lot of training speech data first, collecting-input vectors 
obtained from their optimal gains and decoded vectors at 
the time of quantization, forming a population, then 
minimizing total distortion indicated by the following 
equation 45 for that population. Specifically, the values 
of Upi and Uri are acquired by solving simultaneous 
equations which are derived by partial differential of the 
equation of the total distortion with respect to Upi and 
Uri. 

r t 
Total = J { Wp x (Pt - ^ Upi x pt,i) 2 + 



pt,0 = Cpn {t) 
rt,0 = Crn (t) 

where Total: total distortion 
t: time (frame number) 

T: the number of pieces of data in the population 





(4 5) 
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(Pt, Rt): optimal gain at time t 
(pti, rti): decoded vector at time t 
Upi, Vpi, Uri, Vri: predictive coefficients (fixed 
values) 

i: index indicating how old the decoded vector is 
1: prediction order. 

(Con . Cm ): code vector at time t 
* <t> <t) 

n : . the number of the code vector 

Wp, Wr: weighting coefficient (fixed) for adjusting 
the sensitivity against distortion. 

According to such a vector quantization scheme, the 
optimal gain can be vector- quantized as it is, the feature 
of the parameter converting section can permit the use of 
the correlation between the relative levels of the power 
and each gain, and the features of the decoded vector 
storage section, the predictive coefficients storage 
section, the target vector extracting section and the 
distance calculator can ensure predictive coding of gains 
using the correlation between the mutual relations between 
the power and two gains. Those features can allow the 
correlation among parameters to be utilized sufficiently. 
(Seventeenth Mode) 

FIG. 26 presents a structural block diagram of a 
parameter coding section of a speech coder according to 
this mode. According to this mode, vector quantization is 
performed while evaluating gain-quantization originated 
distortion from two synthesized speeches corresponding to 



the index of an excitation vector and a perpetual weighted 
input speech. 

As shown in FIG. 26, the parameter coding section has 
a parameter calculator 2602, which computes parameters 
necessary for distance computation from input data or a 
perpetual weighted input speech, a perpetual weighted LPC 
synthesis of adaptive code vector and a perpetual weighted 
LPC synthesis of random code vecror 2601 to be input, a 
decoded vector stored in a decoding vector storage section, 
and predictive coefficients stored in a predictive 
coefficients storage section, a decoded vector storage 
section 2603 where old decoded code vectors are stored, a 
predictive coefficients storage section 2604 where 
predictive coefficients are stored, a distance calculator 
2605 for computing coding distortion of the time when 
decoding is implemented with a plurality of code vectors 
stored in a vector codebook by using the predictive 
coefficients stored in the predictive coefficients storage 
section, a vector codebook 2606 where a plurality of code 
vectors are stored, and a comparator 2607, which controls 
the vector codebook and the distance calculator for 
comparison of the coding distortions obtained from the 
distance calculator to acquire the number of the most 
appropriate code vector, acquires a code vector from the 
vector storage section based on the obtained number, and 
updates the content of the decoded vector storage section 
using that code vector. 
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A description will now be given of the vector 
quantizing operation of the thus constituted parameter 
coding section. The vector codebook 2606 where a plurality 
of general samples (code vectors) of a quantization target 
vector are stored should be prepared in advance. This is 
generally prepared by an LBG algorithm (IEEE TRANSACTIONS 
ON COMMUNICATIONS, VOL. COM-28, NO. 1, PP 84-95, JANUARY 
1980) or the like based on multiple vectors which are 
obtained by analyzing multiple speech data. Coefficients 
for predictive coding should be stored in the predictive 
coefficients storage section 2604. Those coefficients in 
use are the same predictive coefficients as stored in the 
predictive coefficients storage section 2505 which has been 
discussed in (Sixteenth Mode). A value indicating a 
unvoiced stateshould be stored as an initial value in the 
decoded vector storage section 2603. 

First, the parameter . calculator 2602 computes 
parameters necessary for distance computation from the 
input perpetual weighted input speech, perpetual weighted 
LPC synthesis of adaptive code vector and perpetual 
weighted LPC synthesis of random code vector, and further 
from the decoded vector stored in the decoded vector 
storage section 2603 and the predictive coefficients stored 
in the predictive coefficients storage section 2604. The 
distances in the distance calculator are based on the 
following equation 46. 



En = y (Xi - Gan x Ai - Gsn x Si) 

Gfl/i = Orn x e x p(Opn) 
Gsn = (1 - Or/z) x e x p(Opn) 

Opn = Yp + UpOxCpn+VpOxCrn (4 Q) 

Gan, Gsn: decoded gain 

(Opn, Orn): decoded vector 
(Yp, Yr): predictive vector 

En: coding distortion when the n-th gain code vector 
is used 

Xi: perpetual weighted input speech 

Ai: perpetual weighted LPC synthesis of adaptive code 
vector 

Si: perpetual weighted LPC synthesis of stochastic 
code vector 

n: code of the code vector 
i: index of excitation data 

I: subframe length (coding unit of the input speech) 

(Cpn, Crn) : code vector 

(pj. rj): old decoded vector 

Upj, Vpj, Urj, Vrj: predictive coefficients (fixed 
values) 

j: index indicating how old the decoded vector is 
J: prediction order. 

Therefore, the parameter calculator 2602 computes 
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those portions which do not depend on the number of a code 
vector. What is to be computed are the predictive vector, 
and the correlation among three synthesized speeches or the 
power. An equation for the computation is given by an 
5 equation 47 . 
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where (Yp, Yr) : predictive vector 

Dxx, Dxa, Dxs, Daa, Das, Dss: value of correction 
among synthesized speeches or the power 
10 Xi: perpetual weighted input speech 

Ai: perpetual weighted LPC synthesis of adaptive code 
vector 

Si: perpetual weighted LPC synthesis of stochastic 
code vector 
15 i:. index of excitation data 

I: subframe length (coding unit of the input speech) 
(pj, rj): old decoded vector 

Upj, Vpj, Urj, Vrj: predictive coefficients (fixed 
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values) 

j; index indicating how old the decoded vector is 
J: prediction order. 

Then, the distance calculator 2506 computes a distance 
5 between a target vector obtained by the target vector 

extracting section 2503 and a code vector stored in the 
vector codebook 2507 by using the predictive coefficients 
stored in the predictive coefficients storage section 2505. 
An equation for computing the distance is given by an 
10 equation 42. 



En = Dxx + (Gan) 2 x Daa + (Gsn) 2 x Dss 

- Gan x Dxa - Gsn x Dxs + Gan x Gsn x Das 

Gan = Orn x txp(Opn) 

Gsn = (1- Orri) x exp(Opn) ■ 

Opn =Yp + UpO xCpn+VpO xCrn 48 

Orn = Yr + UrO x Cpn + KrO x Cm 
where En: coding distortion when the n-th gain code vector 



Dxx, Dxa, Dxs, Daa, Das, Dss: value of correction 
among synthesized speeches or the power 
Gan, Gsn: decoded gain 
(Opn, Orn): decoded vector 



UpO, VpO, UrO, VrO: predictive coefficients (fixed 

values) 

(Cpn, Crn): code vector 

n: the number of the code vector. 

Actually, Dxx does not depend on the number n of the 



15 



is used 



20 



(Yp, Yr) : predictive vector 



124 



code vector so that its addition can be omitted. 

Then, the comparator 2607 controls the vector codebook 
2606 and the distance calculator 2605 to acquire the number 
of the code vector which has the shortest distance computed 
by the distance calculator 2605 from among a plurality of 
code vectors stored in the vector codebook 2606, and sets 
the number as a gain code 2608. Based on the obtained gain 
code 2608, the comparator 2607 acquires a decoded vector 
and updates the content of the decoded vector storage 
section 2603 using that vector. A code vector is obtained 
from the equation 44. 

Further, the updating scheme, the equation 44, is used. 
Meanwhile, the speech decoder should previously be 
provided with a vector codebook, a predictive coefficients 
storage section and a coded vector storage section similar 
to those of the speech coder, and performs decoding through 
the functions of the comparator of the coder of generating 
a decoded vector and updating the decoded vector storage 
section, based on the gain code transmitted from the coder. 

According to the thus constituted mode, vector 
quantization can be performed while evaluating gain- 
quantization originated distortion from two synthesized 
speeches corresponding to the index of the excitation 
vector and the input speech, the feature of the parameter 
converting section can permit the use of the correlation 
between the relative levels of the power and each gain, and 
the features of the decoded vector storage section, the 
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predictive coefficients storage section, the target vector 
extracting section and the distance calculator can ensure 
predictive coding of gains using the correlation between 
the mutual relations between the power and two gains. This 
5 can allow the correlation among parameters to be utilized 
sufficiently. 
(Eighteenth Mode) 

FIG. 27 presents a structural block diagram of the 
essential portions of a noise canceler according to this 
10 mode. This noise canceler is installed in the above- 
described speech coder. For example, it is placed at the 
preceding stage of the buffer 1301 in the speech coder 
shown in FIG. 13. 

The noise canceler shown in FIG. 27 comprises an A/D 
15 converter 272, a noise cancellation coefficient storage 
section 273, a noise cancellation coefficient adjusting 
section 274, an input waveform setting section 275, an LPC 
analyzing section 276, a Fourier transform section 277, a 
noise canceling/ spectrum compensating section 278, a 
20 spectrum stabilizing section 279, an inverse Fourier 

transform section 280, a spectrum enhancing section 281, a 
waveform matching section 282, a noise estimating section 
284, a noise spectrum storage section 285, a previous 
spectrum storage section 286, a random phase storage 
25 section 287, a previous waveform storage section 288, and 
maximum power storage section 289. 

To begin with, initial settings will be discussed. 
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Table 10 shows the names of fixed parameters and setting 
examples . 



Table 10 



Fixed Parameters 


oei. tiny examples 


frame length 


160 (20 msec for 8-kHz 
sampling data) 


pre-read data length 


80 (10 msec for the 
above data) 


FET order 


2 5 6 


LPC prediction order 


1 0 


sustaining number of noise 




spectrum reference 


3 0 


designated minimum power 


2 0.0 


AR enhancement coefficient 0 


0. 5 


ma enhancement coefficient 0 


0. 8 


hiah-f reauencv enhancement 


0. 4 


coefficient 0 




AR enhancement coefficient 1-0 


0.6 6 


MA enhancement coefficient 1-0 


0. 6 4 
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0 . 7 


MR anhflriPPTTiAnt 1 nnftfficient 1-1 


0.6 


nign-xrecjueucy eiuicuioeuieii u 


0 3 


coefficient 1 




y^r^iuroT" pnhanrPTTipn t" coefficient 


1 . 2 


IlUlo t3 J.CJ-C1.C110C ^wwc-j. 


2 0 0 0 0. 0 


unvoxcea segmenr power 


\J • o 


reduction coefficient 




compensation power increase 


2. 0 






number of consecutive noise 


5 


references 




noise cancellation coefficient 


0. 8 


training coefficient 




unvoiced segment detection 


0. 0 5 


coefficient 




designated noise cancellation 


1.5 


coefficient 






Phase data for adjusting the phase should have been 
stored in the random phase storage section 287. Those are 
used to rotate the phase in the spectrum stabilizing 
section 279. Table 11 shows a case where there are eight 
5 kinds of phase data. 



Table 11 











Phase 


Data 








( 


- 0. 


5 1, 


0. 


8 6), 


( 0. 


9 8, 


- 0. 


1 7) 


( 


0. 


3 0, 


0. 


9 5), 


(-0. 


5 3, 


- 0. 


8 4) 


( 


- 0. 


9 4, 


-0. 


3 4), 


( 0. 


7 0, 


0. 


7 1) 


( 


- 0. 


2 2, 


0. 


9 7), 


( 0. 


3 8, 


-0. 


9 2) 



Further, a counter (random phase counter) for using 
the phase data should have been stored in the random phase 

10 storage section 287 too. This value should have been 
initialized to 0 before storage. 

Next, the static RAM area is set. Specifically, the 
noise cancellation coefficient storage section 273, the 
noise spectrum storage section 285, the previous spectrum 

15 storage section 286, the previous waveform' storage section 
288 and the maximum power storage section 289 are cleared. 
The following will discuss the individual storage sections 
and a setting example. 

The noise cancellation coefficient storage section 273 

20 is an area for storing a noise cancellation coefficient 
whose initial value stored is 20.0. The noise spectrum 
storage section 285 is an area for storing, for each 
frequency, mean noise power, a mean noise spectrum, a 
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compensation noise spectrum for the first candidate, a 
compensation noise spectrum for the second candidate, and a 
frame number (sustaining number) indicating how many frames 
earlier the spectrum value of each frequency has changed; a 
5 sufficiently large value for the mean noise power, 

designated minimum power for the mean noise spectrum, and 
sufficiently large values for the compensation noise 
spectra and the sustaining number should be stored as 
initial values. 

10 The previous spectrum storage section 286 is an area 

for storing compensation noise power, power (full range, 
intermediate range) of a previous frame (previous frame 
power), smoothing power (full range, intermediate range) of 
a previous frame (previous smoothing power), and a noise 

15 sequence number; a sufficiently large value for the 

compensation noise power, 0.0 for both the previous frame 
power and full frame smoothing power and a noise reference 
sequence number as the noise sequence number should be 
stored. 

20 The previous waveform storage section 288 is an area 

for storing data of the output signal of the previous frame 
by the length of the last pre-read data for matching of the 
output signal, and all 0 should be stored as an initial 
value. The spectrum enhancing section 281, which executes 

25 ARMA and high-frequency enhancement filtering, should have 
the statuses of the respective filters cleared to 0 for 
that purpose- The maximum power storage section 289 is an 
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area for storing the maximum power of the input signal, and 
should have 0 stored as the maximum power. 

Then, the noise cancellation algorithm will be 
explained block by block with reference to FIG. 27. 

First, an analog input signal 271 including a speech 
is subjected to A/D conversion in the A/D converter 272, 
and is input by one frame length + pre-read data length 
(160 + 80 = 240 points in the above setting example). The 
noise cancellation coefficient adjusting section 274 
computes a noise cancellation coefficient and a 
compensation coefficient from an equation 49 based on the 
noise cancellation coefficient stored in the noise 
cancellation coefficient storage section. 273, a designated 
noise cancellation coefficient, a learning coefficient for 
the noise cancellation coefficient, and a compensation 
power increase coefficient. The obtained noise cancellation 
coefficient is stored in the noise cancellation coefficient 
storage section 273, the input signal obtained by the A/D 
converter 272 is sent to the input waveform setting section 
275, and the compensation coefficient and noise 
cancellation coefficient are sent to the noise estimating 
section 284 and the noise canceling/ spectrum compensating 
section 278. 



q = qXc + QX(1 - G) 

r = Q/qXD (49) 
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where q: noise cancellation coefficient 

Q: designated noise cancellation coefficient 
C: learning coefficient for the noise cancellation 
coefficient 

r: compensation coefficient 
D: compensation power increase coefficient. 
The noise cancellation coefficient is a coefficient 
indicating a rate of decreasing noise, the designated noise 
cancellation coefficient is a fixed coefficient previously 
designated, the learning coefficient for the noise 
cancellation coefficient is a coefficient indicating a rate 
by which the noise cancellation coefficient approaches the 
designated noise cancellation coefficient, the compensation 
coefficient is a coefficient for adjusting the compensation- 
power in the spectrum compensation, and the compensation 
power increase coefficient is a coefficient for adjusting 
the compensation coefficient. 

In the input waveform setting section 275, the input 
signal from the A/D converter 272 is written in a memory 
arrangement having a length of 2 to an exponential power 
from the end in such a way that FFT (Fast Fourier 
Transform) can be carried out. 0 should be filled in the 
front portion. In the above setting example, 0 is written 
in 0 to 15 in the arrangement with a length of 256, and the 
input signal is written in 16 to 255. This arrangement is 
used as a real number portion in FFT of the eighth order. 
An arrangement having the same length as the real number 
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portion is prepared for an imaginary number portion, and 
all 0 should be written there. 

In the LPC analyzing section 276, a hamming window is 
put on the real number area set in the input waveform 
setting section 275, autocorrelation analysis is performed 
on the Hamming -windowed waveform to acquire an 
autocorrelation value, and autocorrelation-based LPC 
analysis is performed to acquire linear predictive 
coefficients. Further, the obtained linear predictive 
coefficients are sent to the spectrum enhancing section 281. 

The Fourier transform section 277 conducts discrete 
Fourier transform by FFT using the memory arrangement of 
the real number portion and the imaginary number portion, 
obtained by the input waveform setting section 275. The sum 
of the absolute values of the real number portion and the 
imaginary number portion of the obtained complex spectrum 
is computed to acquire the pseudo amplitude spectrum (input 
spectrum hereinafter) of the input signal. Further, the 
total sum of the input spectrum value of each frequency 
(input power hereinafter) is obtained and sent to the noise 
estimating section 284. The complex spectrum itself is sent 
to the spectrum stabilizing section 279. 

A process in the noise estimating section 284 will now 
be discussed. 

The noise estimating section 284 compares the input 
power obtained by the Fourier transform section 277 with 
the maximum power value stored in the maximum power storage 
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section 289, and stores the maximum power value as the 
input power value in the maximum power storage section 289 
when the maximum power is smaller. If at least one of the 
following cases is satisfied, noise estimation, is performed, 
and if none of them are met, noise! estimation is not 
carried out . 

(1) The input power is smaller than the maximum power 
multiplied by an unvoiced segment detection coefficient. 

(2) The noise cancellation coefficient is larger than 
the designated noise cancellation coefficient plus 0.2. 

(3) The input power is smaller than a value obtained 
by multiplying the mean noise power, obtained from the 
noise spectrum storage section 285, by 1.6. 

The noise estimating algorithm in the noise estimating 
section 284 will now be discussed. 

First, the sustaining numbers of all the frequencies 
for the first and second candidates stored in the noise 
spectrum storage section 285 are updated (incremented by 1). 
Then, the sustaining number of each frequency for the first 
candidate is checked, and when it is larger than a 
previously set sustaining number of noise spectrum 
reference, the compensation spectrum and sustaining number 
for the second candidate are set as those for the first 
candidate, and the compensation spectrum of the second 
candidate is set as that of the third candidate and the 
sustaining number is set to 0. Note that in replacement of 
the compensation spectrum of the second, candidate, the 
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memory can be saved by not storing the third candidate and 
substituting a value slightly larger than the second 
candidate. In this mode, a spectrum which is 1.4 times 
greater than the compensation spectrum of the second 
candidate is substituted. 

After renewing the sustaining number, the compensation 
noise spectrum is compared with the input spectrum for each 
frequency. First, the input spectrum of each frequency is 
compared with the compensation nose spectrum of the first 
candidate, and when the input spectrum is smaller, the 
compensation noise spectrum and sustaining number for the 
first candidate are set as those for the second candidate, 
and the input spectrum is set as the compensation spectrum 
of the first candidate with the sustaining number set to 0. 
In other cases than the mentioned condition, the input 
spectrum is compared with the compensation nose spectrum of 
the second candidate, and when the input spectrum is 
smaller, the input spectrum is set as the compensation 
spectrum of the second candidate with the sustaining number 
set to 0. Then, the obtained compensation spectra and 
sustaining numbers of the first and second candidates are 
stored in the noise spectrum storage section 285. At the 
same time, the mean noise spectrum is updated according to 
the following equation 50. 

Si = SiXg + SiX(l - g) (50). 
where s: means noise spectrum 
S: input spectrum 



g: 0.9 (when the input power is larger than a half the 
mean noise power) 

0.5 (when the input power is equal to or smaller 
than a half the mean noise power) 

i: number of the frequency. 

The mean noise spectrum is pseudo mean noise spectrum, 
and the coefficient g in the equation 50 is for adjusting 
the speed of learning the mean noise spectrum. That is, the 
coefficient has such an effect that when the input power is 
smaller than the noise power, it is likely to be a noise- 
only segment so that the learning speed will be increased, 
and otherwise, it is likely to be in a speech segment so 
that the learning speed will be reduced. 

Then, the' total of the values of the individual 
frequencies of the mean noise spectrum is obtained to be 
the mean noise power. The compensation noise spectrum, mean 
noise spectrum and mean noise power are stored in the noise 
spectrum storage section 285. 

' In the above noise estimating process, the capacity of 
the RAM constituting the noise spectrum storage section 28.5 
can be saved by making a noise spectrum of one frequency 
correspond to the input spectra of a plurality of 
frequencies. As one example is illustrated the RAM capacity 
of the noise spectrum storage section 285 at the time of 
estimating a noise spectrum of one frequency from the input 
spectra of four frequencies with FFT of 256 points in this 
mode used. In consideration of the (pseudo) amplitude 
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spectrum being horizontally symmetrical with respect to the 
frequency axis, to make estimation for all the frequencies, 
spectra of 128 frequencies and 128 sustaining numbers are 
stored, thus requiring the RAM capacity of a total of 768 W 
or 128 (frequencies) X2 (spectrum and sustaining number) X 
3 (first and second candidates for compensation and mean). 

When a noise spectrum of one frequency is made to 
correspond to input spectra of four frequencies , by 
contrast, the required RAM capacity is a total of 192 W or 
32 (frequencies) X2 (spectrum and sustaining number) X3 
(first and second candidates for compensation and mean). In 
this case, it has been confirmed through experiments that 
for the above 1 X4 case, the performance is hardly 
deteriorated while the frequency resolution of the noise 
spectrum decreases. Because this means is not for 
estimation of a noise spectrum from a spectrum of one 
frequency, it has an effect of preventing the spectrum from 
being erroneous estimated as a noise spectrum when a normal 
sound (sine wave, vowel or the like) continues for a long 

period of time. 

A description will now be given of a process in the 
noise canceling/ spectrum compensating section 278. 

A result of multiplying the mean noise spectrum, 
stored in the noise spectrum storage section 285, by the 
noise cancellation coefficient obtained by the noise 
cancellation coefficient adjusting section 274 is 
subtracted from the input spectrum (spectrum difference 
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hereinafter). When the RAM capacity of the noise spectrum 
storage section 285 is saved as described in the 
explanation of the noise estimating section 284, a result 
of multiplying a mean noise spectrum of a frequency 
5 corresponding to the input spectrum by the noise 

cancellation coefficient is subtracted. When the spectrum 
difference becomes negative, compensation is carried out by 
setting a value obtained by multiplying the first candidate 
of the compensation noise spectrum stored in the noise 

10 spectrum storage section 285 by the compensation 

coefficient obtained by the noise cancellation coefficient 
adjusting section 274. This is performed for every 
frequency. Further, flag data is prepared for each 
frequency so that the frequency by which the spectrum 

15 difference has been compensated can be grasped. For example, 
there is one area for each frequency, and 0 is set in case 
of no compensation, and 1 is set when compensation has been 
carried out. This flag data is sent together with the 
spectrum difference to the spectrum stabilizing section 279. 

20 Furthermore, the total number of the compensated 

(compensation number) is acquired by checking the values of 
the flag data, and it is sent to the spectrum stabilizing 
section 279 too. 

A process in the spectrum stabilizing section 279 will 

25 be discussed below. This process serves to reduce allophone 
feeling mainly of a segment which does not contain speeches. 
First, the sum of the spectrum differences of the 
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individual frequencies obtained from the noise 
canceling/ spectrum compensating section 278 is computed to 
obtain two kinds of current frame powers, one for the full 
range and the other for the intermediate range. For the 
5 full range, the current frame power is obtained for all the 
frequencies (called the full range; 0 to 128 in this mode). 
For the intermediate range, the current frame power is 
obtained for an perpetually important, intermediate band 
(called the intermediate range; 16 to 79 in this mode). 

10 Likewise, the sum of the compensation noise spectra 

for the first candidate, stored in the noise spectrum 
storage section 285, is acquired as current frame noise 
power (full range, intermediate range). When the values of 
the compensation numbers obtained from the noise 

15 canceling/ spectrum compensating section 278 are checked and 
are sufficiently large, and when at least one of the 
following three conditions is met, the current frame is 
determined as a noise-only segment and a spectrum 
stabilizing process is performed. 

20 (1) The input power is smaller than the maximum power 

multiplied by an unvoiced segment detection coefficient. 

(2) The current frame power (intermediate range) is 
smaller than the current frame noise power (intermediate 
range) multiplied by 5.0. 

25 (3) The input power is smaller than noise reference 

power . 

In a case where no stabilizing process is not 
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conducted, the consecutive noise number stored in the 
previous spectrum storage section 286 is decremented by 1 
when it is positive, and the current frame noise power 
(full range, intermediate range) is set as the previous 
frame power (full range, intermediate range) and they are 
stored in the previous spectrum storage section 286 before 
proceeding to the phase diffusion process. 

The spectrum stabilizing process will now be discussed. 
The purpose for this process is to stabilize the spectrum 
in an unvoiced segment (speech-less and noise-only segment) 
and reduce the power. There are two kinds of processes, and 
a process 1 is performed when the consecutive noise number 
is smaller than the number of consecutive noise references 
while a process 2 is performed otherwise. The two processes 
will be described as follow. 
(Process 1) 

The consecutive noise number stored in the previous 
spectrum storage section 286 is incremented by 1, and the 
current frame noise power (full range, intermediate range) 
is set as the previous frame power (full range, 
intermediate range) and they are stored in the previous 
spectrum storage section 286 before proceeding to the phase 
adjusting process. 
(Process 2) 

The previous frame power, the previous frame smoothing 
power and the unvoiced segment power reduction coefficient, 
stored in the previous spectrum storage section 286, are 



referred to and are changed according to an equation 51. 

Dd80 = Dd80X0.8 + A80Xo.2Xp 

D80 = D80X0.5 + Dd80Xo.5 

Ddl29 = Ddl29X0.8 + A129Xo.2Xp (51) 

D129 * D129X0.5 + Ddl29X0.5 

where Dd80: previous frame smoothing power (intermediate 
range ) 

D80: previous frame power (intermediate range) 
Ddl29: previous frame smoothing power (full range) 
D129: previous frame power (full range) 
A80: current frame noise power (intermediate range) 
A129: current frame noise power (full range). 
Then, those powers are reflected on the spectrum 
differences. Therefore, two coefficients, one to be 
multiplied in the intermediate range (coefficient 1 
hereinafter) and the other to be multiplied in the full 
range (coefficient 2 hereinafter), are computed. First, the 
coefficient 1 is computed from an equation 52. 

rl = D80/A80 (when A80 > 0) 

1.0 (when A80 ^ 0) (52) 
where rl: coefficient 1 

D80: previous frame power (intermediate range) 

A80: current frame noise power (intermediate range). 

As the coefficient 2 is influenced by the coefficient 
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1, acquisition means becomes slightly complicated. The 
procedures will be illustrated below. 

(1) When the previous frame smoothing power (full 
range) is smaller than the previous frame power 

5 (intermediate range) or when the current frame noise power 
(full range) is smaller than the current frame noise power 
(intermediate range), the flow goes to (2), but goes to (3) 
otherwise. 

(2) The coefficient 2 is set to 0.0, and the previous 
10 frame power (full range) is set as the previous frame power 

(intermediate range), then the flow goes to (6). 

(3) When the current frame noise power (full range) is 
equal to the current frame noise power (intermediate range), 
the flow goes to (4), but goes to (5) otherwise. 

15 (4) The coefficient 2 is set to 1.0, and then the flow 

goes to (6) . 

(5) The coefficient 2 is acquired from the following 
equation 53, and then the flow goes to (6). 

20 r2 = (D129 - D80)/(A129 - A80) (53) 

where r2 : coefficient 2 

D129: previous frame power (full range) 
D80: previous frame power (intermediate range) 
25 A129: current frame noise power (full range) 

A80: current frame noise power (intermediate range). 

(6) The computation of the coefficient 2 is terminated. 
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The coefficients 1 and 2 obtained in the above 
algorithm always have their upper limits clipped to 1.0 and 
lower limits to' the unvoiced segment power reduction 
coefficient. A value obtained by multiplying the spectrum 
5 difference of the intermediate frequency (16 to 79 in this 
example) by the coefficient 1 is set as a spectrum 
difference, and a value obtained by multiplying the 
spectrum difference of the frequency excluding the 
intermediate range from the full range of that spectrum 
10 difference (0 to 15 and 80 to 128 in this example) by the 

coefficient 2 is set as a spectrum difference. Accordingly, 
the previous frame power (full range, intermediate range) 
is converted by the following equation 54. 

15 D80 = A80Xrl 

D129 = D80 + (A129 - A80)Xr2 (54) 

where rl: coefficient 1 
r2: coefficient 2 

20 D80: previous frame power (intermediate range) 

A80: current frame noise power (intermediate range) 
D129: previous frame power (full range) 
A129: current frame noise power (full range). 
Various sorts of power data, etc. obtained in this 

25 manner are all stored in the previous spectrum storage 
section 286 and the process 2 is then terminated. 

The spectrum stabilization by the spectrum stabilizing 



section 279 is carried out in the above manner. 

Next, the phase adjusting process will be explained. 
While the phase is not changed in principle in the 
conventional spectrum subtraction, a process of altering 
the phase at random is executed when the spectrum of that 
frequency is compensated at the time of cancellation. This 
process enhances the randomness of the remaining noise, 
yielding such an effect of making is difficult to give a 
perpetually adverse impression. 

First, the random phase counter stored in the random 
phase storage section 287 is obtained. Then, the flag data 
(indicating the presence/absence of compensation) of all 
the frequencies are referred to, and the phase of the 
complex spectrum obtained by the Fourier transform section 
277 is rotated using the following equation 55 when 
compensation has been performed. 
Bs = SiXRc - TiXRc + 1 
Bt = SiXRc + 1 + TiXRc 
Si- Bs (55) 
Ti = Bt 

where Si, Ti: complex spectrum 

i: index indicating the frequency 

R: random phase data 

c: random phase counter 

Bs, Bt: register for computation. 

In the equation 55, two random phase data are used in 
pair. Every time the process is performed once, the random 
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phase counter is incremented by 2, and is set to 0 when it 
reaches the upper limit (16 in this mode). The random phase 
counter is stored in the random phase storage section 287 
and the acquired complex spectrum is sent to the inverse 
5 Fourier transform section 280. Further, the total of the 
spectrum differences (spectrum difference power 
hereinafter) and it is sent to the spectrum enhancing 
section 281 . 

The inverse Fourier transform section 280 constructs a 
10 new complex spectrum based on the amplitude of the spectrum 
£ difference and the phase of the complex spectrum, obtained 

^ by the spectrum stabilizing section 279, and carries out 

B inverse Fourier transform using FFT. (The yielded signal is 

m 

yl called, a first order output signal.) The obtained first 

O 15 order output signal is sent to the spectrum enhancing 

hj section 281. 

□ Next, a process in the spectrum enhancing section 281 

will be discussed. 

First, the mean noise power stored in the noise 

20 spectrum storage section 285, the spectrum difference power 
obtained by the spectrum stabilizing section 279 and the 
noise reference power, which is constant, are referred to 
select an MA enhancement coefficient and AR enhancement 
coefficient. The selection is implemented by evaluating the 

25 following two conditions. 
(Condition 1) 

The spectrum difference power is greater than a value 
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obtained by multiplying the mean noise power, stored in the 
noise spectrum storage section 285, by 0.6, and the mean 
noise power is greater than the noise reference power. 
(Condition 2) 

5 The spectrum difference power is greater than the mean 

noise power. 

When the condition 1 is met, this segment is a "voiced 
segment," the MA enhancement coefficient is set to an MA 
enhancement coefficient 1-1, the AR enhancement coefficient 

10 is set to an AR enhancement coefficient 1-1, and a high- 
frequency enhancement coefficient is set to a high- 
frequency enhancement coefficient 1. When the condition 1 
is not satisfied but the condition 2 is met, this segment 
is an "unvoiced segment," the MA enhancement coefficient is 

15 set to an MA enhancement coefficient 1-0, the AR 

enhancement coefficient is set to an AR enhancement 
coefficient 1-0, and the high-frequency enhancement 
coefficient is set to 0. When the condition 1 is satisfied 
but the condition 2 is not, this segment is an "unvoiced, 

20 noise-only segment," the MA enhancement coefficient is set 
to an MA enhancement coefficient 0, the AR enhancement 
coefficient is set to an AR enhancement coefficient 0, and 
the high-frequency enhancement coefficient is set to a 
high-frequency enhancement coefficient 01 

25 Using the linear predictive coefficients obtained from 

the LPC analyzing section 276, the MA enhancement 
coefficient and the AR enhancement coefficient, an MA 
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coefficient AR coefficient of an extreme enhancement filter 
are computed based on the following equation 56. 

a (ma)i = aiX 

a (ar)i = u±X 7* (56) 

where a(ma)i: MA coefficient 
a(ar)i: AR . coefficient 
ai: linear predictive coefficient 
£ : MA enhancement coefficient 
7 : AR enhancement coefficient 
i: number. 

Then, the first order output signal acquired by the 
inverse Fourier transform section 280 is put through the 
extreme enhancement filter using the MA coefficient and AR 
coefficient. The transfer function of this filter is given 
by the following equation 57. 

l + a(ma) x xZ" 1 +a(ma) 2 x Z' 2 '^-^a{ma)jxZr^ 

. — — ■ — ■ ~ (57) 

l + a(ar\ xZ" 1 +a(ar) 1 *Z' 2 +-"+a(ar) J xZ' J 
where a(ma) : MA coefficient 
a(ar) : AR coefficient 
j : order . 

Further, to enhance the high frequency component, 
high-frequency enhancement filtering is performed by using 
the high-frequency enhancement coefficient. The transfer 
function of this f^Jter . is given by the following equation 
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1 - 5Z" 1 (58) 
where <5 : high-frequency enhancement coefficient. 

A signal obtained through the above process is called 
a second order output signal. The filter status is saved in 
the spectrum enhancing section 281. 

Finally, the waveform matching section 282 makes the 
second order output signal, obtained by the spectrum 
enhancing section 281, and the signal stored in the 
previous waveform storage section 288, overlap one on the 
other with a triangular window. Further, data of this 
output signal by the length of the last pre-read data is 
stored in the previous waveform storage section 288. A 
matching scheme at this time is shown by the following 
equation 59. 

j) x Z )/L (j - 0 ~ L - 1) 
(j = L ~ L-7-M - 1) 
(j = 0 ~ L - 1) (59) 

where 0 : output signal 
i 

D : second order output signal 

3 

. Z : output signal 
i 

L: pre-read data length 
M: frame length. 

It is to be noted that while data of the pre-read date 
length + frame length is output as the output signal, that 



0^ = (j x D } + (L 



J i 
Z = 0 
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of the output signal which can be handled as a signal is 
only a segment of the frame length from the beginning of 
the data. This is because, later data of the pre-read data 
length will be rewritten when the next output signal is 
output. Because continuity is compensated in the entire 
segments of the output signal, however, the data can be 
used in frequency analysis, such as LPC analysis or filter 
analysis. 

According to this mode, noise spectrum estimation can 
be conducted for a segment outside a voiced segment as well 
as in a voiced segment, so that a noise spectrum can be 
estimated even when it is not clear at which timing a 
speech is present in data. 

It is possible to enhance the characteristic of the 
input spectrum envelope with the linear predictive 
coefficients, and to possible to prevent degradation of the 
sound quality even when the noise level is high. 

Further, using the mean spectrum of noise can cancel 
the noise spectrum more significantly. Further, separate 
estimation of the compensation spectrum can ensure more 
accurate compensation. 

It is possible to smooth a spectrum in a noise-only 
segment where no speech is contained, and the spectrum in 
this segment can prevent allophone feeling from being 
caused by an extreme spectrum variation which is originated 
from noise cancellation. 

The phase of the compensated frequency component can 
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be given a random property, so that noise remaining 
uncanceled can be converted to noise which gives less 
perpetual allophone feeling. 

The proper weighting can perpetually be given in a 
5 voiced segment, and perpetual-weighting originating 

allophone feeling can be suppressed in an unvoiced segment 
or an unvoiced syllable segment. 

Industrial Applicability 

As apparent from the above, an excitation vector 
generator, a speech coder and speech decoder according to 
this invention are effective in searching for excitation 
vectors and are suitable for improving the speech quality. 




